This section will cover:

* Regression
* ANOVA
* Multiple Explanatory Variables
* Multiple Explanatory Variables with Interactions
* Model Simplification
Linear Regression is a class of Linear Models that is frequently a good choice if both your response (dependent) variable and your predictor (independent) variable are continuous. In this section you will learn:
To explore your data in order to determine whether a Linear Model is a good choice
To fit a Linear Regression Model to data
To determine whether the Model fits your data adequately (its “significance”)
It is expected that you have already been introduced to, or are familiar with, the concepts (and/or theory) underlying Linear Models. If not, you may want to watch the video.
We will use the genome size data.
So,
\(\star\) Open R and setwd to your code directory.
\(\star\) Create a new blank script called Regression.R and add some introductory comments.
\(\star\) Add code to your script to load the genome size data into R and check it (again, using the relative path prefix ../, assuming that your working directory is code):
genome <- read.csv('../data/GenomeSize.csv',stringsAsFactors = T)
head(genome)
## Suborder Family Species GenomeSize GenomeSE GenomeN
## 1 Anisoptera Aeshnidae Aeshna canadensis 2.20 NA 1
## 2 Anisoptera Aeshnidae Aeshna constricta 1.76 0.06 4
## 3 Anisoptera Aeshnidae Aeshna eremita 1.85 NA 1
## 4 Anisoptera Aeshnidae Aeshna tuberculifera 1.78 0.10 2
## 5 Anisoptera Aeshnidae Aeshna umbrosa 2.00 NA 1
## 6 Anisoptera Aeshnidae Aeshna verticalis 1.59 NA 1
## BodyWeight TotalLength HeadLength ThoraxLength AdbdomenLength ForewingLength
## 1 0.159 67.58 6.83 11.81 48.94 45.47
## 2 0.228 71.97 6.84 10.72 54.41 46.00
## 3 0.312 78.80 6.27 16.19 56.33 51.24
## 4 0.218 72.44 6.62 12.53 53.29 49.84
## 5 0.207 73.05 4.92 11.11 57.03 46.51
## 6 0.220 66.25 6.48 11.64 48.13 45.91
## HindwingLength ForewingArea HindwingArea MorphologyN
## 1 45.40 369.57 483.61 2
## 2 45.48 411.15 517.38 3
## 3 49.47 460.72 574.33 1
## 4 48.82 468.74 591.42 2
## 5 45.97 382.48 481.44 1
## 6 44.91 400.40 486.97 1
We can use plot to create a scatterplot between two variables. If you have a set of variables to explore, writing code for each plot is tiresome, so R provides the function pairs, which creates a grid of scatterplots between each pair of variables. This plots all the pairwise combinations of variables in your dataset, making it easier to explore. All it needs is a dataset.
First, plot all pairs:
library(repr); options(repr.plot.res = 100, repr.plot.width = 10, repr.plot.height = 10)
pairs(genome)
That’s messy! We are plotting too many columns against each other, so it’s hard to get a meaningful picture (literally). So let’s try colour-coding the scatterplot markers by suborder:
pairs(genome, col=genome$Suborder)
The result is still too messy! There are far too many variables in genome for this to be useful. Before we proceed further,
\(\star\) Add pairs(genome, col=genome$Suborder) into your script and run the code:
So, we need to cut down the data to fewer variables. Previously (Experimental design section) we used indices to select colours; here, we can use indices to select columns from the data frame. This again uses square brackets (x[]), but a data frame has two dimensions, rows and columns, so you need to provide an index for each dimension, separated by commas. If an index is left blank, then all of that dimension (i.e. all rows or columns) are selected. Try the following to re-acquaint yourself with accessing data frame content using indices:
library(repr); options(repr.plot.res = 100, repr.plot.width = 8, repr.plot.height = 8) # change plot size
# create a small data frame:
dat <- data.frame(A = c("a", "b", "c", "d", "e"), B = c(1, 2, 3, 4, 5))
dat[1, ] # select row 1 (all columns selected)
## A B
## 1 a 1
dat[, 2] # select column 2 (all rows selected)
## [1] 1 2 3 4 5
dat[2, 1] # select row 2, column 1
## [1] b
## Levels: a b c d e
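As a brief extension of the toy data frame above (purely illustrative), columns can also be selected by name, and rows by a logical condition — both are often more robust than numeric indices:

```r
# Toy data frame, as above
dat <- data.frame(A = c("a", "b", "c", "d", "e"), B = c(1, 2, 3, 4, 5))
dat[, "B"]        # select column B by name; same as dat[, 2]
dat[dat$B > 3, ]  # logical indexing: rows where B exceeds 3
```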
Now let’s resume the actual analysis. We will look at five key variables: genome size, body weight, total length, forewing length and forewing area. If you look at the output of str(genome), you’ll see that these are in columns 4, 7, 8, 12 and 14. We can record the indices of these columns and use this to select the data in the pairs plot:
morpho_vars <- c(4, 7, 8, 12, 14) # store the indices
pairs(genome[, morpho_vars], col = genome$Suborder)
\(\star\) Add the code above to your script and run it.
In the figure above, each scatterplot is shown twice, with the variables swapping between the \(x\) and \(y\) axes. You can see immediately that the relationships between the four morphological measurements and genome size are fairly scattered but that the plots comparing morphology show much clearer relationships.
One way of summarising how strong the pair-wise relationships between these variables are is to calculate a correlation coefficient. The Pearson correlation looks at the difference of each point from the mean of each variable (and since it uses means, it is a parametric statistic).
It is calculated using the differences from the mean on each axis. The key calculation is, for each point, to get the product of the differences on each axis and add them up. If the points are mostly top left (\(-x\), \(y\)) or bottom right (\(x\), \(-y\)) then these products are mostly negative (\(-xy\)); if the points are mostly top right (\(x\), \(y\)) or bottom left (\(-x\), \(-y\)) then the products are mostly positive (\(xy\)).
The plots above show three clear cases where all the values of \(xy\) are negative or positive or where both are present and sum to zero. The Pearson correlation coefficient simply scales these sums of \(xy\) to be between -1 (perfectly negatively correlated) and 1 (perfectly positively correlated) via zero (no correlation). Specifically, this correlation coefficient (\(r\)) is equal to the average product of the standardized values of the two variables (let’s call them \(x\) and \(y\)):
\[r_{xy}={\frac{\sum _{i=1}^{n}\left({\frac {x_{i}-{\bar {x}}}{s_{x}}}\right)\left({\frac {y_{i}-{\bar {y}}}{s_{y}}}\right)}{n-1}}\]
where,
\(n\) is the sample size
\(x_i, y_i\) are the individual sample points indexed by \(i\) (\(i = 1,2,\ldots, n\))
\(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\) is the sample mean (and similarly for \(\bar{y}\))
\(s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}\) is the sample standard deviation (and similarly for \(s_y\))
The quantities \(\left({\frac {x_{i}-{\bar {x}}}{s_{x}}}\right)\) and \(\left(\frac{y_i-\bar{y}}{s_y} \right)\) are the Z-scores (aka standard scores) of \(x\) and \(y\). This conversion of the raw scores (\(x_i\)’s and \(y_i\)’s) to Z-scores is called standardizing (or sometimes, normalizing).
Thus the coefficient (\(r_{xy}\)) will be positive (negative) if the \(x_i\) and \(y_i\)’s tend to move in the same (opposite) direction relative to their respective means (as illustrated in the figure above).
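Before turning to R’s built-in functions, here is a quick numerical sanity check of the formula above, using small made-up vectors (hypothetical values, not the genome data): the correlation is the mean product of the Z-scores.

```r
# Made-up illustrative vectors:
x <- c(1.2, 2.3, 3.1, 4.8, 5.0)
y <- c(2.0, 2.9, 3.9, 5.2, 6.1)
n <- length(x)
zx <- (x - mean(x)) / sd(x)        # Z-scores of x
zy <- (y - mean(y)) / sd(y)        # Z-scores of y
r_manual <- sum(zx * zy) / (n - 1) # the formula above
all.equal(r_manual, cor(x, y))     # matches R's built-in cor
```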
In R, we will use two functions to look at correlations. The first is cor, which can calculate correlations between pairs of variables, so is a good partner for pairs plots. The second is cor.test, which can only compare a single pair of variables, but uses a \(t\) test to assess whether the correlation is significant.
\(\star\) Try the following (and include it in your R script file):
cor(genome[, morpho_vars], use = "pairwise")
## GenomeSize BodyWeight TotalLength ForewingLength ForewingArea
## GenomeSize 1.0000000 0.3430934 0.3407077 0.2544432 0.3107247
## BodyWeight 0.3430934 1.0000000 0.9167995 0.8944228 0.9198821
## TotalLength 0.3407077 0.9167995 1.0000000 0.9225974 0.9077555
## ForewingLength 0.2544432 0.8944228 0.9225974 1.0000000 0.9829803
## ForewingArea 0.3107247 0.9198821 0.9077555 0.9829803 1.0000000
This is the correlation matrix. Then:
cor.test(genome$GenomeSize, genome$TotalLength, use = "pairwise")
##
## Pearson's product-moment correlation
##
## data: genome$GenomeSize and genome$TotalLength
## t = 3.5507, df = 96, p-value = 0.0005972
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1526035 0.5049895
## sample estimates:
## cor
## 0.3407077
The use='pairwise' tells R to omit observations with missing data and use complete pairs of observations. The first function confirms our impressions from the graphs: the correlations between genome size and morphology are positive but comparatively weak and the correlations between morphological measurements are positive and very strong (i.e. close to 1). The correlation test tells us that genome size and body length are positively correlated (r=0.34, \(t\) = 3.5507, df = 96, \(p\) = 0.0006).
Also, in case you are wondering about what the t-value in the cor.test signifies:
The t-test is used within cor.test to establish if the correlation coefficient is significantly different from zero, that is, to test if there is a significant association between the two variables. The t-test, in general, can be used to test if the mean of a sample is significantly different from some reference value (1-sample t-test). Here, the “mean of the sample” is the observed correlation coefficient, and the reference value is 0 (the null hypothesis, that there is no association).
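To make this concrete, the \(t\) value that cor.test reports can be reconstructed from \(r\) and its degrees of freedom using \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) (the numbers below are copied from the output above):

```r
# Values copied from the cor.test output above:
r  <- 0.3407077
df <- 96                        # n - 2
t_val <- r * sqrt(df) / sqrt(1 - r^2)
round(t_val, 4)                 # 3.5507, as reported by cor.test
```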
There is one problem with the correlations above: the correlation coefficient calculation assumes a straight line relationship. Some of the scatterplots above are fairly straight but there are some strongly curved relationships. This is due to allometric scaling, where one body measure changes (or grows) disproportionately with respect to another. Here, two of the variables are in linear units (total and forewing length), one is in squared units (forewing area) and one in cubic units (body weight, which is approximately volume). That these measures are in different units itself guarantees that they will scale allometrically with respect to each other.
The relationships between these variables can be described using a power law:
\[y = ax^b\]
Fortunately, if we log transform this equation, we get \(\log(y) = \log(a) + b \log(x)\). This is the equation of a straight line (\(y=a+bx\)), so we should be able to make these plots straighter by logging both axes. We can create a new logged variable in the data frame like this:
genome$logGS <- log(genome$GenomeSize)
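As a minimal sketch of why this works (with arbitrary illustrative constants a and b, not fitted to the genome data), points generated from a power law fall exactly on a straight line once both variables are logged:

```r
# Arbitrary power-law constants for illustration:
a <- 2; b <- 3
x <- c(1, 2, 4, 8, 16)
y <- a * x^b                  # y = a * x^b
fit <- lm(log(y) ~ log(x))    # log(y) = log(a) + b*log(x)
coef(fit)                     # intercept is log(a) = log(2); slope is b = 3
```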
\(\star\) Using this command as a template, create a new logged version of the five variables listed above:
genome$logGS <- log(genome$GenomeSize)
genome$logBW <- log(genome$BodyWeight)
genome$logTL <- log(genome$TotalLength)
genome$logFL <- log(genome$ForewingLength)
genome$logFA <- log(genome$ForewingArea)
\(\star\) Then, using str, work out which column numbers the logged variables are and create a new variable called logmorpho containing these numbers:
str(genome)
## 'data.frame': 100 obs. of 21 variables:
## $ Suborder : Factor w/ 2 levels "Anisoptera","Zygoptera": 1 1 1 1 1 1 1 1 1 1 ...
## $ Family : Factor w/ 9 levels "Aeshnidae","Calopterygidae",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Species : Factor w/ 100 levels "Aeshna canadensis",..: 1 2 3 4 5 6 8 17 46 53 ...
## $ GenomeSize : num 2.2 1.76 1.85 1.78 2 1.59 1.44 1.16 1.44 1.2 ...
## $ GenomeSE : num NA 0.06 NA 0.1 NA NA NA NA NA NA ...
## $ GenomeN : int 1 4 1 2 1 1 1 1 1 1 ...
## $ BodyWeight : num 0.159 0.228 0.312 0.218 0.207 0.22 0.344 0.128 0.392 0.029 ...
## $ TotalLength : num 67.6 72 78.8 72.4 73 ...
## $ HeadLength : num 6.83 6.84 6.27 6.62 4.92 6.48 7.53 5.74 8.05 5.28 ...
## $ ThoraxLength : num 11.8 10.7 16.2 12.5 11.1 ...
## $ AdbdomenLength: num 48.9 54.4 56.3 53.3 57 ...
## $ ForewingLength: num 45.5 46 51.2 49.8 46.5 ...
## $ HindwingLength: num 45.4 45.5 49.5 48.8 46 ...
## $ ForewingArea : num 370 411 461 469 382 ...
## $ HindwingArea : num 484 517 574 591 481 ...
## $ MorphologyN : int 2 3 1 2 1 1 4 1 1 1 ...
## $ logGS : num 0.788 0.565 0.615 0.577 0.693 ...
## $ logBW : num -1.84 -1.48 -1.16 -1.52 -1.58 ...
## $ logTL : num 4.21 4.28 4.37 4.28 4.29 ...
## $ logFL : num 3.82 3.83 3.94 3.91 3.84 ...
## $ logFA : num 5.91 6.02 6.13 6.15 5.95 ...
logmorpho <- c(17,18,19,20,21)
We can now use pairs and cor as before for the columns in logmorpho:
pairs(genome[, logmorpho], col=genome$Suborder)
cor(genome[, logmorpho], use='pairwise')
## logGS logBW logTL logFL logFA
## logGS 1.00000000 0.08406293 0.2224443 0.1150025 0.06808306
## logBW 0.08406293 1.00000000 0.8891899 0.9456492 0.94995683
## logTL 0.22244431 0.88918989 1.0000000 0.9157695 0.86207098
## logFL 0.11500250 0.94564919 0.9157695 1.0000000 0.97916470
## logFA 0.06808306 0.94995683 0.8620710 0.9791647 1.00000000
The scatterplots show that logging the data has very successfully addressed curvature (non-linearity) due to allometric scaling between variables in the data.
We’ll now look at fitting the first linear model of this course, to explore whether log genome size explains log body weight. The first thing to do is to plot the data:
myColours <- c('red', 'blue') # Choose two colours
mySymbols <- c(1,3) # And two different markers
colnames(genome)
## [1] "Suborder" "Family" "Species" "GenomeSize"
## [5] "GenomeSE" "GenomeN" "BodyWeight" "TotalLength"
## [9] "HeadLength" "ThoraxLength" "AdbdomenLength" "ForewingLength"
## [13] "HindwingLength" "ForewingArea" "HindwingArea" "MorphologyN"
## [17] "logGS" "logBW" "logTL" "logFL"
## [21] "logFA"
library(repr); options(repr.plot.res = 100, repr.plot.width = 6, repr.plot.height = 6) # change plot size
plot(logBW ~ GenomeSize , data = genome, #Now plot again
col = myColours[Suborder], pch = mySymbols[Suborder],
xlab='Genome size (pg)', ylab='log(Body weight) (mg)')
legend("topleft", legend=levels(genome$Suborder), #Add legend at top left corner
col= myColours, pch = mySymbols)
It is clear that the two suborders have very different relationships: to begin with we will look at dragonflies (Anisoptera). We will calculate two linear models:
The Null Model: This is the simplest linear model: nothing is going on and the response variable just has variation around the mean: \(y = \beta_1\). This is written as an R formula as y ~ 1.
The Linear (Regression) Model: This models a straight line relationship between the response variable and a continuous explanatory variable: \(y= \beta_1 + \beta_{2}x\).
The code below fits these two models.
nullModelDragon <- lm(logBW ~ 1, data = genome, subset = Suborder == "Anisoptera")
genomeSizeModelDragon <- lm(logBW ~ logGS, data = genome, subset = Suborder == "Anisoptera")
Note: give your models informative names; generic names like mod1, mod2, and so on swiftly get confusing!

\(\star\) Add these model fitting commands into your script and run them.
Now we want to look at the output of the model. Remember from the lecture that a model has coefficients (the \(\beta\) values in the equation of the model) and terms which are the explanatory variables in the model. We’ll look at the coefficients first:
summary(genomeSizeModelDragon)
##
## Call:
## lm(formula = logBW ~ logGS, data = genome, subset = Suborder ==
## "Anisoptera")
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3243 -0.6124 0.0970 0.5194 1.3236
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.39947 0.09085 -26.413 < 2e-16 ***
## logGS 1.00522 0.23975 4.193 9.54e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6966 on 58 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.2326, Adjusted R-squared: 0.2194
## F-statistic: 17.58 on 1 and 58 DF, p-value: 9.539e-05
There is a lot of information there: the model description (‘Call’), a summary of the residuals, a table of coefficients and then information on residual standard error, r squared and an \(F\) test.
All of these will become clearer during this course (in particular, the meaning of Adjusted R-squared will be explained in the ANOVA section); for the moment, concentrate on the coefficients table.
There are two rows in the coefficients table, one for each coefficient in \(y=\beta_1 + \beta_2x\): the intercept and the slope of the line. The rest of the details on each row are a \(t\) test of whether the slope and intercept are significantly different from zero.
The (least-squares) estimate of the slope coefficient (logGS) is equal to the correlation coefficient between the dependent variable (\(y\), here logBW) and the independent variable (\(x\), here logGS) times the ratio of the (sample) standard deviations of the two (see above for the definitions of these):
\[ \text{Slope} = \beta_2 = r_{xy} \frac{s_y}{s_x}\]
Thus you can see that the regression slope is proportional to the correlation coefficient; the ratio of standard deviations serves to scale the correlation coefficient (which is unit-less) appropriately to the actual units in which the variables are measured.
The (least-squares) estimate of the intercept is the mean of the dependent variable minus the estimated slope coefficient times the mean of the independent variable:
\[ \text{Intercept} = \beta_1 = \bar{y} - \beta_2 \bar{x} \]
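These two identities are easy to verify numerically. The sketch below uses made-up vectors (any paired data without missing values would do):

```r
# Illustrative made-up data:
x <- c(0.5, 0.6, 0.8, 0.9, 1.1, 1.3)
y <- c(-2.1, -1.8, -1.9, -1.2, -1.0, -0.7)
fit <- lm(y ~ x)
slope_manual     <- cor(x, y) * sd(y) / sd(x)      # r * s_y / s_x
intercept_manual <- mean(y) - slope_manual * mean(x)
all.equal(unname(coef(fit)), c(intercept_manual, slope_manual))  # TRUE
```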
The standard error of the model (“Residual standard error” in the output above, also referred to as the standard error of the regression) is equal to the square root of the sum of the squared residuals divided by \(n-2\). The sum of squared residuals is divided by \(n-2\) in this calculation rather than \(n-1\) because an additional degree of freedom for error has been used up by estimating two parameters (a slope and an intercept) rather than only one (the mean) in fitting the model to the data.
As you can see in the output above, each of the two model parameters, the slope and intercept, has its own standard error, which quantifies the uncertainty in that estimate. These can be used to construct the Confidence Intervals around these estimates, which we will learn about later.
The main take-away from all this is that the standard errors of the coefficients are directly proportional to the standard error of the regression and inversely proportional to the square root of the sample size (\(n\)). Thus, “noise” in the data (measured by the residual standard error) inflates the errors in the coefficient estimates, and 4 times as much data will tend to reduce the standard errors of all the coefficients by approximately a factor of 2.
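For simple regression there is also an explicit formula: the slope’s standard error is the residual standard error divided by \(\sqrt{\sum_i (x_i - \bar{x})^2}\). A sketch on made-up data (not the genome data):

```r
# Illustrative made-up data:
x <- c(0.5, 0.6, 0.8, 0.9, 1.1, 1.3)
y <- c(-2.1, -1.8, -1.9, -1.2, -1.0, -0.7)
fit <- lm(y ~ x)
# Residual standard error divided by sqrt of the x sum of squares:
se_slope_manual <- summary(fit)$sigma / sqrt(sum((x - mean(x))^2))
all.equal(unname(summary(fit)$coefficients["x", "Std. Error"]),
          se_slope_manual)  # TRUE
```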
Now we will look at the terms of the model using the anova function. We will have a proper look at ANOVA (Analysis of Variance) later.
Meanwhile, recall from your lecture that ANOVA tests how much variation in the response variable is explained by each explanatory variable. We only have one variable and so there is only one row in the output:
anova(genomeSizeModelDragon)
## Analysis of Variance Table
##
## Response: logBW
## Df Sum Sq Mean Sq F value Pr(>F)
## logGS 1 8.5294 8.5294 17.579 9.539e-05 ***
## Residuals 58 28.1417 0.4852
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This table is comparing the variation in log body weight explained by log genome size to the total variation in log body weight. We are interested in how much smaller the residuals are for the genome size model than the null model. Graphically, how much shorter are the red residuals than the blue residuals:
We can get the sums of the squares of these residuals from the two models using the function resid, and then square them and add them up:
sum(resid(nullModelDragon) ^ 2)
## [1] 36.67107
sum(resid(genomeSizeModelDragon) ^ 2)
## [1] 28.14168
So we have five columns in the ANOVA table:
Df: This shows the degrees of freedom. Each fitted parameter/coefficient uses up a degree of freedom from the total sample size, and what is left over is the residual degrees of freedom. In this case, genome size adds a slope (compare the null model \(y=\beta_1\) and this model \(y=\beta_1 + \beta_2x\): there is one more \(\beta\)).
Sum Sq: This shows sums of squares. The bottom line is the residual sum of squares for the model and the one above is the variation explained by genome size. Using the two values from above, the sums of square residuals for the null model are 36.67. In the genome size model, the sum of square residuals are 28.14 and so \(36.67-28.14=8.53\) units of variance have been explained by this model.
Mean Sq: These are just the Sum Sq (Sum of Squares) values divided by the degrees of freedom. The idea behind this is simple: explaining a lot of variation per degree of freedom spent is good; explaining only a small amount of extra variation at the cost of a degree of freedom (by adding and then estimating more parameters) is bad.
F value: This is the ratio of the Mean Sq for the variable and the residual Mean Sq. This is used to test whether the explained variation is large or small.
Pr(>F): This is the \(p\) value: the probability of obtaining an F value at least this large by chance if the explanatory variable in fact explained no variance.
In this case, it is clear that genome size explains a significant amount of the variation in body weight.
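You can reconstruct the F value in the table directly from the two residual sums of squares computed above (numbers copied from the output):

```r
# Residual sums of squares copied from the output above:
rss_null  <- 36.67107   # null model
rss_model <- 28.14168   # genome size model
ess   <- rss_null - rss_model         # variation explained by logGS (1 df)
f_val <- (ess / 1) / (rss_model / 58) # Mean Sq ratio
round(f_val, 2)                       # 17.58, as in the ANOVA table
```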
\(\star\) Include the summary and anova commands for genomeSizeModelDragon in your script, run them and check you are happy with the output.
Using the above code as a template, create a new model called genomeSizeModelDamsel that fits log body weight as a function of log genome size for damselflies. Write and run code to get the summary and anova tables for this new model. For example, the first step would be:
genomeSizeModelDamsel <- lm(logBW ~ logGS, data=genome,subset=Suborder=='Zygoptera')
Now we can plot the data and add lines to show the models. For simple regression models, we can use the function abline(modelName) to add a line based on the model.
\(\star\) Create a scatterplot of log body weight as a function of log genome size, picking your favourite colours for the points.
Use abline to add a line for each model and use the col option in the function to colour each line to match the points.
For example: abline(genomeSizeModelDragon, col='red').
myCol <- c('red','blue')
library(repr); options(repr.plot.res = 100, repr.plot.width = 8, repr.plot.height = 8) # change plot size
plot(logBW ~ logGS, data=genome, col=myCol[Suborder], xlab='log Genome Size (pg)', ylab='log Body Weight (g)')
abline(genomeSizeModelDragon, col='red')
abline(genomeSizeModelDamsel, col='blue')
Your final figure should look something like this:
Now that we have our models, we need to check that they are appropriate for the data. For this, we will inspect “diagnostic plots”. Producing diagnostic plots is easy in R — if you plot the model object, then R produces a set of diagnostic plots!
\(\star\) Quick note on par: this function sets graphical parameters; here, mfrow = c(2, 2) arranges the next four plots in a 2 by 2 grid, and mar sets the plot margins.
\(\star\) Try the following code (and include in the R script file):
par(mfrow = c(2, 2), mar = c(5, 5, 1.5, 1.5))
plot(genomeSizeModelDragon)
These are the diagnostics for the lm fit to the Dragonfly data subset.
Let’s also plot the diagnostic plots for the model fitted to the Damselfly subset:
par(mfrow = c(2, 2), mar = c(5, 5, 1.5, 1.5))
plot(genomeSizeModelDamsel)
The diagnostic plots are:
Residuals vs Fitted: This plot is used to spot whether the distribution of the residuals (the vertical distance from a point to the regression line) has similar variance for different predicted values (the y-value on the line corresponding to each x-value). There should be no obvious patterns (such as curves) or big gaps. If there were no scatter, i.e. if all the points fell exactly on the line, then all of the dots on this plot would lie on the gray horizontal dashed line. The red line is a smoothed curve to make it easier to see trends in the residuals. It is flat in the Dragonfly model fit above, and a bit more wavy than we would like in the Damselfly model fit, but there are no clear trends in either, which is what you hope to see.
Normal Q–Q: This plot is to check whether the residuals are normally distributed — are the values of the observed residuals similar to those expected under a normal distribution? Ideally, the points should form a perfectly straight line, indicating that the observed residuals exactly match the expected. Here, note that the points lie pretty close to the dashed line in both sets of diagnostic Figures above, but deviate at the ends, especially for Damselflies. However, some deviation is to be expected near the ends — here these deviations are just about acceptable.
Scale–Location: The x-axis on this plot is identical to the Residuals vs Fitted plot: these are the fitted values. The y-axis is the square root of the absolute standardized residuals, where standardized residuals are the residuals rescaled so that they have a mean of zero and a variance of one. As a result, all y-axis values are positive: large residuals (both positive and negative) plot at the top, and small residuals plot at the bottom, so only their magnitude is retained. This is why all of the numbered points (which are the same across the diagnostic plots) appear at the top here. The red line again shows the trend, just as in the Residuals vs Fitted plot. The regression analysis assumes homoscedasticity, i.e. that the variance of the residuals does not change as a function of the predictor. If that assumption holds, the red line should be relatively flat. It is not quite as flat as we would like, especially for the Dragonfly analysis.
Residuals vs Leverage: This plot shows the standardized residuals against leverage. “Leverage” is a measure of how much each data point influences the linear model’s coefficient estimates. Because the regression line must pass through the centroid (“pivot point”) of the data, points that lie far from the centroid have greater leverage, and their leverage increases if there are fewer points nearby. Here is an illustration:
There are two key things to note about this plot:
The standardized residuals (y-axis) are centered around zero and reach 2-3 standard deviations away from zero. They should also lie symmetrically about zero, as would be expected for a normal distribution. This is the case for the Damselfly plot, but less so for the Dragonfly plot.
The contour lines show Cook’s distance (only visible in the Damselfly plot), which measures how much the regression would change if a point were deleted. Cook’s distance is increased by leverage and by large residuals: a point far from the centroid with a large residual can severely distort the coefficient estimates from the regression. On this plot, you want to see that the red smoothed line stays close to the horizontal gray dashed line and that no points have a large Cook’s distance (i.e., > 0.5). Both are true here.
This is an important diagnostic plot in regression analyses in particular because it tells you whether your estimate of the slope coefficient in particular is strongly affected by certain data points.
Note that certain points are numbered in all the diagnostic plots — these are points to pay special attention to because they are potential outliers. The numbers correspond to the row number for that dataset in your data frame. You can easily identify these points in your data plot because the order of the points along the fitted-value (x) axis in the diagnostic plots matches their order along the x-axis in the data plot. So, for example here, in the dragonfly diagnostic plots the two numbered points (46, 10) near the bottom correspond in the data plot to the two red points near the center-left that lie farthest below the red line (see the plot with regression lines fitted to the data).
Thus, neither the Dragonfly nor the Damselfly diagnostic plots look perfect, but this level of deviation from the assumptions of linear models is acceptable. The main worrying features are that the Q–Q plot for Damselflies indicates the observed residuals are a bit more extreme than expected, and the Scale–Location plot for Dragonflies suggests some pattern in the standardized residuals with respect to the fitted values.
\(\star\) Copy the code to create the diagnostic plots into your script to keep a record of the code and run it.
Now that we know the models are appropriate and we have a plot, the last thing is to report the statistics. For the damselfly model, here is one summary that would do: log genome size explains significant variation in log body weight in damselflies (F=10.5, df=1,36, p=0.0025), with body weight decreasing with genome size (intercept: -4.65, se=0.09; slope: -1.14, se=0.35).
Analysis of Variance is very often a good choice if your response (dependent) variable is continuous and your predictor (independent) variable is categorical.
In this section, you will learn to perform an ANOVA, that is, fit this linear model to the data. Specifically, you will learn to\(^{[1]}\):
Visualize the data by plotting boxplots and barplots
Fit an ANOVA to test whether certain factors can explain (partition) the variation in your data
Perform diagnostics to determine whether the factors are explanatory, and whether the Linear Model is appropriate for your data
Explore and compare how much the different levels of a factor explain the variation in your data
Analysis Of Variance (ANOVA) is an extremely useful class of Linear models. It is very often appropriate when your response (dependent) variable is continuous, while your predictor (independent) variable is categorical. Specifically, One-way ANOVA is used to compare means of two or more samples representing numerical, continuous data in response to a single categorical variable (factor).
(One-way) ANOVA tests the null hypothesis that samples from two or more groups (the treatments or factors) are drawn from populations with the same mean value. That is, the null hypothesis is that all the groups are random samples from the same population (no statistical difference in their means). To do this, ANOVA compares the variance in the data explained by fitting the linear model to the unexplained variance (the null hypothesis).
In other words, in effect, ANOVA asks whether a linear model with a predictor (or explanatory variable) with at least two categorical levels (or factors), better accounts for the variance (Explained Sum of Squares, ESS, see below) than a null model of the form \(y = \beta_1\). Thus, ANOVA is just a type of linear model.
By the end of this section, it will also make more sense to you how/why fitting a linear regression model to the data of the form \(y = \beta_1 + \beta_2 x\) (where \(x\) is a continuous predictor variable), requires an ANOVA to determine if the model better fits than a null model of the form \(y = \beta_1\).
Typically, one-way ANOVA is used to test for differences among at least three groups, since the two-group (or levels or factors) case can be covered by a \(t\)-test. When there are only two means to compare, the \(t\)-test and the F-test are equivalent; the relation between ANOVA and t is given by \(F = t^2\).
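The two-group equivalence \(F = t^2\) is easy to demonstrate numerically. A minimal sketch with arbitrary made-up numbers:

```r
# Two made-up groups of observations:
g1 <- c(4.1, 5.0, 5.6, 4.8, 5.3)
g2 <- c(6.2, 5.9, 7.1, 6.6, 6.0)
dat <- data.frame(y = c(g1, g2), grp = factor(rep(c("A", "B"), each = 5)))
t_out <- t.test(y ~ grp, data = dat, var.equal = TRUE)  # classic 2-sample t
f_out <- anova(lm(y ~ grp, data = dat))                 # one-way ANOVA
all.equal(unname(t_out$statistic)^2, f_out$`F value`[1])  # F = t^2
```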
An extension of one-way ANOVA is two-way analysis of variance that examines the influence of two different categorical independent variables on one dependent variable. We will explore multiple predictors in later segments.
ANOVA uses the F-Statistic. To this end, an ANOVA “partitions” variability in your data as follows:
Total sum of squares (TSS): This is the sum of the squared difference between the observed dependent variable (\(y\)) and the mean of the response variable \(y\) (denoted by \(\bar{y}\)), i.e.,
\[\text{TSS} = \sum_{i=1}^{n}(y_i - \bar{y})^2\]
TSS tells us how much variation there is in the dependent variable without having any other information (your null model). You might notice that TSS is the numerator of the sample variance (and hence of its square root, the sample standard deviation).
Explained sum of squares (ESS): Sum of the squared differences between the predicted \(y\)’s (denoted \(\hat{y}\)’s) and \(\bar{y}\), or, \[\text{ESS} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2\] ESS tells us how much of the variation in the dependent variable our alternative (linear) model was able to explain. That is, it’s the reduction in uncertainty that occurs when the linear model is used to predict the responses.
Residual sum of squares (RSS): Sum of the squared differences between the observed \(y\)’s (denoted by \(y_i\)) and the predicted \(\hat{y}\), or, \[\text{RSS} = \sum_{i=1}^{n} (\hat{y}_i - y_i)^2\] RSS tells us how much of the variation in the dependent variable our model could not explain. That is, it’s the uncertainty that remains even after the linear model is used. The linear model is considered to be statistically significant if it can account for a large amount of variability in the response.
The sums of squares used to calculate the statistical significance of the linear model (Regression, ANOVA, etc) through the F value are as follows:
| Type of Sum of Squares (SS) | SS Calculation | Degrees of Freedom (DF) | Mean Sum of Squares (MSS) |
|---|---|---|---|
| TSS | \(\sum_{i=1}^{n}(y_i - \bar{y})^2\) | \(n-1\) | \(\frac{TSS}{n-1}\) |
| ESS | \(\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2\) | \(n_c-1\) | \(\frac{ESS}{n_c-1}\) |
| RSS | \(\sum_{i=1}^{n} (\hat{y}_i - y_i)^2\) | \(n-n_c\) | \(\frac{RSS}{n-n_c}\) |
Let's try to make sense of these calculations. Firstly, because we are dealing with samples, understanding the degrees of freedom is critical.
Each sum of squares has a corresponding degrees of freedom (DF) associated with it that gives the Mean Sum of Squares (MSS) — the Sums of Squares divided by the corresponding degrees of freedom.
The TSS DF is one less than the number of observations: \(n-1\). This is because calculating TSS requires \(\bar y\), whose estimation costs one degree of freedom. Note that the Mean TSS is thus nothing but the sample variance.
The ESS DF is one less than the number of coefficients (\(n_c\)) (estimated parameters) in the model: \(n_c-1\). Note that in the case where the linear model is an ANOVA, the number of coefficients equals the number of "treatments" (the categories or levels of the predictor, i.e., the factor). So, for example, in Figure 1 there are three treatments and therefore three coefficients (\(\beta_1\), \(\beta_2\), \(\beta_3\)), which means that the ESS degrees of freedom there is \(n_c-1 = 2\).
The RSS DF is the sample size \(n\) minus the number of coefficients that you need to estimate (\(n_c\)), that is, \(n - n_c\), because each estimated coefficient is an unknown parameter.
The F-Value or F-Ratio, the test statistic used to decide whether the linear model fit is statistically significant, is the ratio of the Mean ESS to the Mean RSS:
\[ F = \frac{\left(\frac{ESS}{n_c-1}\right)}{\left(\frac{RSS}{n-n_c}\right)} \]
If the null hypothesis that the group means are drawn from sub-populations with the same mean were true, the between-group variance (the numerator of this F-ratio) should be similar to the within-group variance (the denominator), giving an F-ratio close to 1. The null hypothesis is rejected if the F-ratio is sufficiently large: the model then explains a significant amount of variance, and we can conclude that the samples were drawn from populations with different mean values.
The p-value is calculated for the overall model fit using the F-distribution.
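The F-ratio and its p-value can be reproduced by hand and checked against R's own anova() table. A sketch with made-up data (a one-way design with three groups, so \(n_c = 3\)):

```r
# Made-up data: one-way design with three groups, so n_c = 3 coefficients
set.seed(42)
grp <- factor(rep(c("a", "b", "c"), each = 8))
y   <- rnorm(24, mean = c(1, 2, 3)[grp])
mod <- lm(y ~ grp)

n   <- length(y)
n_c <- length(coef(mod))
ESS <- sum((fitted(mod) - mean(y))^2)
RSS <- sum(resid(mod)^2)

Fval <- (ESS / (n_c - 1)) / (RSS / (n - n_c))
pval <- pf(Fval, df1 = n_c - 1, df2 = n - n_c, lower.tail = FALSE)

# Both should match the entries of anova(mod)
c(Fval, anova(mod)$`F value`[1])
c(pval, anova(mod)$`Pr(>F)`[1])
```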
Also note that the Root Mean Square Error (RMSE), also known as the standard error of the estimate, is the square root of the Mean RSS. It is the standard deviation of the data about the Linear model, rather than about the sample mean.
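A quick check on made-up data that the square root of the Mean RSS matches the "Residual standard error" reported by summary():

```r
# Made-up data: a one-way design with three groups
set.seed(42)
grp <- factor(rep(c("a", "b", "c"), each = 8))
y   <- rnorm(24, mean = c(1, 2, 3)[grp])
mod <- lm(y ~ grp)

RSS  <- sum(resid(mod)^2)
RMSE <- sqrt(RSS / df.residual(mod)) # square root of the Mean RSS

all.equal(RMSE, summary(mod)$sigma)  # the "Residual standard error" in summary()
```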
The \(R^{2}\), also called the Coefficient of Determination, is the proportion of total error (TSS) explained by the model (ESS), so the ratio ESS/TSS. That is it is the proportion of the variability in the response that is explained by the fitted model. Since TSS = ESS + RSS, \(R^{2}\) can be rewritten as (TSS-RSS)/TSS = 1 - RSS/TSS. If a model perfectly fits the data, \(R^{2}=1\), and if it has no predictive capability \(R^{2}=0\). In reality, \(R^{2}\) will never be exactly 0 because even a null model will explain some variance just by chance due to sampling error. Note that \(R\), the square root of \(R^2\), is the multiple correlation coefficient: the correlation between the observed values (\(y\)), and the predicted values (\(\hat{y}\)).
As additional predictors (and therefore linear model coefficients) are added to a linear model, \(R^2\) increases even when the new predictors add no real predictive capability. The adjusted-\(R^2\) tries to address this problem of over-specification or over-fitting by accounting for the degrees of freedom: Adjusted \(R^2\) = 1 - (RSS/(\(n-n_c\)))/(TSS/(\(n-1\))) \(^{[2]}\).
Thus, additional predictors with little explanatory capability will increase the ESS (and reduce the RSS) only slightly, but they also reduce the RSS degrees of freedom (because each extra fitted coefficient increases \(n_c\))\(^{[3]}\). If the additional predictors have poor predictive capability, the small drop in RSS is offset by the loss of degrees of freedom. In other words, the Adjusted \(R^2\) penalizes the addition of new predictors to the linear model, so you should always have a look at the Adjusted \(R^2\) as a corrected measure of \(R^2\).
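Both \(R^2\) and the adjusted \(R^2\) can be computed by hand from the sums of squares and checked against summary() (made-up data again; note the RSS degrees of freedom are \(n - n_c\)):

```r
# Made-up data: one-way design with three groups (n_c = 3)
set.seed(42)
grp <- factor(rep(c("a", "b", "c"), each = 8))
y   <- rnorm(24, mean = c(1, 2, 3)[grp])
mod <- lm(y ~ grp)

n   <- length(y)
n_c <- length(coef(mod))
TSS <- sum((y - mean(y))^2)
RSS <- sum(resid(mod)^2)

r2     <- 1 - RSS / TSS
adj_r2 <- 1 - (RSS / (n - n_c)) / (TSS / (n - 1))

all.equal(r2,     summary(mod)$r.squared)
all.equal(adj_r2, summary(mod)$adj.r.squared)
```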
In this section, we will use a dataset from a recent paper on the effects of temperature and larval resource supply on Aedes aegypti life history traits and the vector’s maximal population growth rate (Huxley et al. 2021).
\(\star\) Download the file traitdata_Huxleyetal_2021.csv and save to your Data directory.
\(\star\) Create a new blank script called ANOVA_Prac.R and add some introductory comments.
\(\star\) Use read.csv to load the data in the data frame mozwing and then str and summary to examine the data:
require('tidyverse')
## Loading required package: tidyverse
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.2 ✓ dplyr 1.0.6
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
require('gplots')
## Loading required package: gplots
##
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
##
## lowess
require('repr')
rm(list=ls())
graphics.off()
mozwing <- read.csv('../data/traitdata_Huxleyetal_2021.csv', stringsAsFactors = T)
mozwing$temp <- as_factor(mozwing$temp) # define temperature and food level as categorical
mozwing$food_level <- as_factor(mozwing$food_level)
str(mozwing)
## 'data.frame': 270 obs. of 33 variables:
## $ ID : int 4 6 7 11 11 12 19 20 16 21 ...
## $ exp_no : int 3 3 3 3 3 3 3 3 3 3 ...
## $ rep : Factor w/ 3 levels "A","B","C": 1 1 1 1 3 1 1 1 3 1 ...
## $ dens : num 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 ...
## $ rearing_vessel : Factor w/ 1 level "tub": 1 1 1 1 1 1 1 1 1 1 ...
## $ temp : Factor w/ 3 levels "22","26","32": 1 1 1 1 1 1 1 1 1 1 ...
## $ food_level : Factor w/ 2 levels "0.1","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ treatment : Factor w/ 6 levels "22_0.1","22_1",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ feeding_interval : int 24 24 24 24 24 24 24 24 24 24 ...
## $ egg_sub : Factor w/ 2 levels "29/03/2019 12:00",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ l_emerg : Factor w/ 2 levels "30/03/2019 12:00",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ l_emerg_days_frm_sub : int 1 1 1 1 1 1 1 1 1 1 ...
## $ l_death : Factor w/ 29 levels "01/04/2019 12:00",..: NA NA NA NA NA NA NA NA NA NA ...
## $ l_surv : int 0 0 0 0 0 0 0 0 0 0 ...
## $ pupal_ref : int 1 3 4 7 2 8 12 13 6 14 ...
## $ p_emerg : Factor w/ 27 levels "01/05/2019 12:00",..: 24 25 26 3 26 3 5 7 27 7 ...
## $ l_to_p_devtime : int 28 29 30 34 30 34 35 36 32 36 ...
## $ l_to_p_devrate : num 0.0357 0.0345 0.0333 0.0294 0.0333 ...
## $ p_death : Factor w/ 8 levels "03/04/2019 12:00",..: NA NA NA NA NA NA NA NA NA NA ...
## $ p_surv : int 0 0 0 0 0 0 0 0 0 0 ...
## $ a_emerg : Factor w/ 30 levels "01/05/2019 12:00",..: 29 30 1 7 1 7 9 13 3 11 ...
## $ p_to_a_devtime : int 3 3 3 3 3 3 3 4 3 3 ...
## $ p_to_a_devrate : num 0.333 0.333 0.333 0.333 0.333 ...
## $ a_death : Factor w/ 39 levels "01/05/2019 12:00",..: 9 5 7 15 5 15 23 12 14 21 ...
## $ adult_lifespan : int 8 5 5 6 4 6 9 1 7 7 ...
## $ adult_mort_rate : num 0.125 0.2 0.2 0.167 0.25 ...
## $ hatch_to_a_devtime : int 31 32 33 37 33 37 38 40 35 39 ...
## $ hatch_to_adult_devrate: num 0.0323 0.0312 0.0303 0.027 0.0303 ...
## $ sex : Factor w/ 1 level "female": 1 1 1 1 1 1 1 1 1 1 ...
## $ im_surv : int 0 0 0 0 0 0 0 0 0 0 ...
## $ j_lifetime : int 31 32 33 37 33 37 38 40 35 39 ...
## $ total_lifespan : int 39 37 38 43 37 43 47 41 42 46 ...
## $ length_mm : num 2.93 2.59 2.41 2.62 2.28 2.39 2.99 2.5 2.75 2.82 ...
There are 33 variables but only a subset of these are of interest to us here. Let’s summarize the data to explore further:
summary(mozwing)
## ID exp_no rep dens rearing_vessel temp
## Min. : 1.00 Min. :3 A:87 Min. :0.2 tub:270 22:90
## 1st Qu.: 9.00 1st Qu.:3 B:87 1st Qu.:0.2 26:90
## Median :17.00 Median :3 C:96 Median :0.2 32:90
## Mean :16.07 Mean :3 Mean :0.2
## 3rd Qu.:24.00 3rd Qu.:3 3rd Qu.:0.2
## Max. :30.00 Max. :3 Max. :0.2
##
## food_level treatment feeding_interval egg_sub
## 0.1:135 22_0.1:45 Min. :24 29/03/2019 12:00:153
## 1 :135 22_1 :45 1st Qu.:24 30/03/2019 12:00:117
## 26_0.1:45 Median :24
## 26_1 :45 Mean :24
## 32_0.1:45 3rd Qu.:24
## 32_1 :45 Max. :24
##
## l_emerg l_emerg_days_frm_sub l_death
## 30/03/2019 12:00:153 Min. :1 31/03/2019 12:00: 35
## 31/03/2019 12:00:117 1st Qu.:1 01/04/2019 12:00: 7
## Median :1 15/05/2019 12:00: 6
## Mean :1 18/04/2019 12:00: 5
## 3rd Qu.:1 14/04/2019 12:00: 4
## Max. :1 (Other) : 46
## NA's :167
## l_surv pupal_ref p_emerg l_to_p_devtime
## Min. :0.0000 Min. : 1.00 03/04/2019 12:00: 26 Min. : 3.0
## 1st Qu.:0.0000 1st Qu.: 6.00 04/04/2019 12:00: 22 1st Qu.: 6.0
## Median :0.0000 Median :11.00 07/04/2019 12:00: 22 Median : 9.0
## Mean :0.3815 Mean :11.65 05/04/2019 12:00: 13 Mean :14.9
## 3rd Qu.:1.0000 3rd Qu.:17.00 06/04/2019 12:00: 10 3rd Qu.:26.0
## Max. :1.0000 Max. :27.00 (Other) : 74 Max. :38.0
## NA's :103 NA's :103 NA's :103
## l_to_p_devrate p_death p_surv
## Min. :0.02632 05/04/2019 12:00: 3 Min. :0.00000
## 1st Qu.:0.03846 08/04/2019 12:00: 2 1st Qu.:0.00000
## Median :0.11111 03/04/2019 12:00: 1 Median :0.00000
## Mean :0.11582 06/04/2019 12:00: 1 Mean :0.06587
## 3rd Qu.:0.16667 07/05/2019 12:00: 1 3rd Qu.:0.00000
## Max. :0.33333 (Other) : 3 Max. :1.00000
## NA's :103 NA's :259 NA's :103
## a_emerg p_to_a_devtime p_to_a_devrate a_death
## 05/04/2019 12:00: 20 Min. :1.000 Min. :0.2500 07/04/2019 12:00: 11
## 10/04/2019 12:00: 20 1st Qu.:2.000 1st Qu.:0.3333 12/04/2019 12:00: 11
## 06/04/2019 12:00: 18 Median :2.000 Median :0.5000 13/04/2019 12:00: 10
## 07/04/2019 12:00: 10 Mean :2.436 Mean :0.4605 02/05/2019 12:00: 9
## 09/04/2019 12:00: 10 3rd Qu.:3.000 3rd Qu.:0.5000 08/04/2019 12:00: 8
## (Other) : 78 Max. :4.000 Max. :1.0000 (Other) :107
## NA's :114 NA's :114 NA's :114 NA's :114
## adult_lifespan adult_mort_rate hatch_to_a_devtime hatch_to_adult_devrate
## Min. : 1.00 Min. :0.06667 Min. : 5.00 Min. :0.02439
## 1st Qu.: 4.00 1st Qu.:0.11111 1st Qu.: 8.00 1st Qu.:0.03571
## Median : 6.00 Median :0.16667 Median :12.00 Median :0.08333
## Mean : 6.59 Mean :0.23194 Mean :17.44 Mean :0.08661
## 3rd Qu.: 9.00 3rd Qu.:0.25000 3rd Qu.:28.00 3rd Qu.:0.12500
## Max. :15.00 Max. :1.00000 Max. :41.00 Max. :0.20000
## NA's :114 NA's :114 NA's :114 NA's :114
## sex im_surv j_lifetime total_lifespan length_mm
## female:156 Min. :0.0000 Min. : 1.00 Min. : 1.00 Min. :2.030
## NA's :114 1st Qu.:0.0000 1st Qu.: 6.00 1st Qu.: 8.00 1st Qu.:2.510
## Median :0.0000 Median :12.00 Median :18.00 Median :2.730
## Mean :0.4222 Mean :15.97 Mean :19.78 Mean :2.742
## 3rd Qu.:1.0000 3rd Qu.:27.00 3rd Qu.:30.00 3rd Qu.:3.020
## Max. :1.0000 Max. :47.00 Max. :49.00 Max. :3.270
## NA's :121
You will see from the output of summary that there are data on several vector traits, such as juvenile development, adult longevity and wing length.
The body size of arthropod vectors is expected to have an important effect on vector-borne disease transmission risk, because it is associated with traits, such as longevity and biting rate. In organisms with complex life histories, including mosquitoes, adult size is strongly affected by larval rearing conditions.
Here, we are interested in finding out whether wing length varies predictably across temperatures (a typical one-way ANOVA question).
\(\star\) Generate a boxplot of the differences in wing length between temperatures:
plot(length_mm ~ temp, mozwing)
Looking at the plot, it is clear that there is more spread in the data below the median than above. Create a new variable loglength in the mozwing data frame containing log-transformed wing length values.
\(\star\) Now create a boxplot of log wing length values within temperatures:
mozwing$loglength <- log(mozwing$length_mm)
Box and whisker plots show the median and spread in the data very clearly, but we want to test whether the means are different. This is \(t\) test territory — how different are the means given the standard error — so it is common to use barplots and standard error bars to show these differences.
We're going to use some R code to construct a barplot by hand. We need to calculate the means and standard errors within temperature groups, but before we can do that, we need a new function to calculate the standard error of a mean:
seMean <- function(x){ # get standard error of the mean from a set of values (x)
x <- na.omit(x) # get rid of missing values
se <- sqrt(var(x)/length(x)) # calculate the standard error
return(se) # tell the function to return the standard error
}
Now we can use the function tapply: it splits a variable up into groups from a factor and calculates statistics on each group using a function.
lengthMeans <- tapply(mozwing$loglength, mozwing$temp, FUN = mean, na.rm = TRUE)
print(lengthMeans)
## 22 26 32
## 1.0737861 0.9824950 0.9131609
And similarly, let’s calculate the standard error of the mean using the function we made:
lengthSE <- tapply(mozwing$loglength, mozwing$temp, FUN = seMean)
print(lengthSE)
## 22 26 32
## 0.01330309 0.01431165 0.01230405
Now we have to put these values together on the plot:
# get the upper and lower limits of the error bars
upperSE <- lengthMeans + lengthSE
lowerSE <- lengthMeans - lengthSE
# get a barplot
# - this function can report where the middle of the bars are on the x-axis
# - set the y axis limits to contain the error bars
barMids <- barplot(lengthMeans, ylim=c(0, max(upperSE)), ylab = 'ln(wing length, mm)')
# Now use the function to add error bars
# - draws arrows between the points (x0,y0) and (x1,y1)
# - arrow heads at each end (code=3) and at 90 degree angles
arrows(barMids, upperSE, barMids, lowerSE, ang=90, code=3)
Now we need to draw all these pieces together into a script and get used to using them.
\(\star\) Add all the lines of code from this section into your script. Run it and check you get the graph above.
\(\star\) Use the preceding code chunks as a model to plot a similar graph for food level:
seMeanfood <- function(x){ # get standard error of the mean from a set of values (x)
x <- na.omit(x) # get rid of missing values
se <- sqrt(var(x)/length(x)) # calculate the standard error
return(se) # tell the function to return the standard error
}
lengthMeansfood <- tapply(mozwing$loglength, mozwing$food_level, FUN = mean, na.rm = TRUE)
print(lengthMeansfood)
## 0.1 1
## 0.9111556 1.0670992
lengthSEfood <- tapply(mozwing$loglength, mozwing$food_level, FUN = seMeanfood)
print(lengthSEfood)
## 0.1 1
## 0.011887620 0.008825873
Lets get the upper and lower limits of the error bars:
upperSEfood <- lengthMeansfood + lengthSEfood
lowerSEfood <- lengthMeansfood - lengthSEfood
Now build the barplot. As before, barplot reports the midpoints of the bars on the x-axis, and we set the y-axis limits to contain the error bars; arrows then draws the error bars between the points (x0, y0) and (x1, y1), with heads at each end (code=3) at 90-degree angles:
barMidsfood <- barplot(lengthMeansfood, ylim=c(0, max(upperSEfood)), ylab = 'ln(wing length, mm)')
arrows(barMidsfood, upperSEfood, barMidsfood, lowerSEfood, ang=90, code=3)
That is a lot of work to go through for a plot. Doing it the hard way uses some useful tricks, but one strength of R is that there is a huge list of add-on packages that you can use to get new functions that other people have written.
We will use the gplots package to create plots of group means and confidence intervals. Rather than plotting the means \(\pm\) 1 standard error, the option p=0.95 uses the standard error and the number of data points to get 95% confidence intervals. The default connect=TRUE option adds a line connecting the means, which isn’t useful here.
\(\star\) Replicate the code below into your script and run it to get the plots below.
#Load the gplots package
library(gplots)
# Get plots of group means and standard errors
par(mfrow=c(1,2))
plotmeans(loglength ~ temp, data=mozwing, p=0.95, connect=FALSE)
plotmeans(loglength ~ food_level, data=mozwing, p=0.95, connect=FALSE)
Hopefully, those plots should convince you that there are differences in wing length across temperatures and food levels. We’ll now use a linear model to test whether those differences are significant.
\(\star\) Using your code from the regression section as a guide, create a linear model called lengthLM which models wing length as a function of temperature.
Use anova and summary to look at the analysis of variance table and then the coefficients of the model.
The ANOVA table for the model should look like the one below: temperature explains highly significant variation in wing length (\(F = 30.5\), df = 2 and 146, \(p = 8.45 \times 10^{-12}\)).
lengthLM <- lm(loglength ~ temp, data=mozwing)
anova(lengthLM)
## Analysis of Variance Table
##
## Response: loglength
## Df Sum Sq Mean Sq F value Pr(>F)
## temp 2 0.58891 0.294454 30.517 8.446e-12 ***
## Residuals 146 1.40873 0.009649
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note the style of reporting the result - the statistic (\(F\)), degrees of freedom and \(p\) value are all provided in support.
Pay close attention to the sum of squares column. This model is good, but some will not be. The key ratio is \(r^2\) (the explained sum of squares divided by the total sum of squares), a measure of explanatory power: although a model can be very significant, it might not be very explanatory. We care about explanatory power or effect size, *not* just \(p\) values.
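You can recover this ratio directly from the Sum Sq column of the ANOVA table above:

```r
# Sums of squares copied from the anova(lengthLM) table above
ESS <- 0.58891    # explained by temp
RSS <- 1.40873    # residual
ESS / (ESS + RSS) # about 0.2948, the Multiple R-squared reported by summary(lengthLM)
```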
The coefficients table for the model looks like this:
summary(lengthLM)
##
## Call:
## lm(formula = loglength ~ temp, data = mozwing)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.27446 -0.06218 0.02685 0.07266 0.14568
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.07379 0.01290 83.252 < 2e-16 ***
## temp26 -0.09129 0.01832 -4.983 1.75e-06 ***
## temp32 -0.16063 0.02122 -7.571 3.87e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09823 on 146 degrees of freedom
## (121 observations deleted due to missingness)
## Multiple R-squared: 0.2948, Adjusted R-squared: 0.2851
## F-statistic: 30.52 on 2 and 146 DF, p-value: 8.446e-12
It shows the model coefficients: the intercept is the mean log wing length in the reference level (22\(^\circ\)C), and temp26 and temp32 are the differences of the 26 and 32\(^\circ\)C group means from that reference, each with a standard error and a \(t\)-test of whether the coefficient differs from zero. To compare every pair of temperature levels while correcting for multiple comparisons, we can use a Tukey Honest Significant Differences test:
TukeyLength <- TukeyHSD(aov(lengthLM))
print(TukeyLength)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = lengthLM)
##
## $temp
## diff lwr upr p adj
## 26-22 -0.09129112 -0.1346712 -0.04791107 0.0000052
## 32-22 -0.16062514 -0.2108632 -0.11038706 0.0000000
## 32-26 -0.06933402 -0.1197347 -0.01893334 0.0039744
The table shows the following:
The differences between the three possible pairs and then the lower and upper bounds of the 95% confidence interval for the difference and a \(p\) value.
In each case, we want to know if the difference could be zero: does the 95% confidence interval for each pair include zero?
For all pairs, the confidence intervals do not include zero, so all pairs of temperatures are significantly different. For example, in the comparison between 32 and 22\(^\circ\)C, the interval does not include zero (difference = -0.16, 95% CI limits are -0.21 and -0.11), so these groups are significantly different.
The \(p\) values for these pairs are broadly consistent with the summary table. However, because the Tukey test corrects for multiple comparisons, it may be harder to find significant pairwise differences in other cases.
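To see at a glance which pairs have confidence intervals that exclude zero, you can inspect the Tukey result programmatically. This sketch uses a built-in R dataset (InsectSprays) so it runs standalone; substitute the component of your own Tukey object:

```r
# Sketch with a built-in dataset (InsectSprays) so it runs standalone
tuk <- TukeyHSD(aov(count ~ spray, data = InsectSprays))$spray
# A pair differs significantly if its 95% CI excludes zero
sig <- tuk[, "lwr"] > 0 | tuk[, "upr"] < 0
sig
```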
You can visualise these confidence intervals by plotting the Tukey test. You have to tweak the graphics parameters to get a clean plot though.
options(repr.plot.res = 100, repr.plot.width = 10, repr.plot.height = 10)
par(mfrow=c(1,1))
par(las=1, mar=c(5,5,5,5))
# las= 1 turns labels horizontal
# mar makes the left margin wider (bottom, left, top, right)
plot(TukeyLength)
\(\star\) Include the Tukey test in your script for both the temperature and food models.
We've looked at two models, using temperature and food level. It is worth asking whether these two factors are independent, because if they are not, a two-way ANOVA would not be appropriate. We will look at interactions later.
OK, so we want to know whether the two factors are independent. This is a job for the \(\chi^2\) test!
The Chi-square (\(\chi^{2}\)) test is designed for scenarios where you want to statistically test how likely it is that an observed distribution of values is due to chance. It is also called a "goodness of fit" statistic, because it measures how well the observed distribution of the data fits the distribution expected if the variables being measured are independent. In our wing length example below, the two variables are temperature and food level.
Note that a \(\chi^{2}\) test is designed to analyze categorical data, that is, data that have been counted (count data) and divided into categories. It is not meant for continuous data (such as body weight, genome size, or height). For example, if you want to test whether attending class influences how students perform on an exam, using test scores (from 0-100) as data would not be appropriate for a Chi-square test. However, arranging students into the categories "Pass" and "Fail" and counting up how many fall in each category would be appropriate. Additionally, the data in a Chi-square table (see below) should not be in the form of percentages; only count data are allowed!
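For example, with purely hypothetical counts (invented here only to illustrate the required data shape):

```r
# Hypothetical counts: did students who attended class pass the exam?
exam <- matrix(c(30, 10,   # attended: 30 passed, 10 failed
                 15, 25),  # skipped:  15 passed, 25 failed
               nrow = 2, byrow = TRUE,
               dimnames = list(c("attended", "skipped"), c("Pass", "Fail")))
chisq.test(exam) # tests whether exam outcome is independent of attendance
```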
We can easily build a table for a Chi-square test on the wing length data as follows:
mozwing <- mozwing[!is.na(mozwing$loglength),] # subset data to omit cells for individuals that did not survive to adulthood
factorTable <- table(mozwing$food_level,mozwing$temp)
print(factorTable)
##
## 22 26 32
## 0.1 23 29 10
## 1 35 28 24
Now let’s run the test:
chisq.test(factorTable)
##
## Pearson's Chi-squared test
##
## data: factorTable
## X-squared = 4.1883, df = 2, p-value = 0.1232
The "X-squared" value is the \(\chi^{2}\) test statistic, akin to the \(t\)-value of the \(t\)-test or the \(W\) value of the Wilcoxon test.
The \(\chi^{2}\) statistic is calculated as the sum of the quantity
\[\frac{(\mathrm{Observed} - \mathrm{Expected})^2}{\mathrm{Expected}}\]
across all the cells/categories in the table (so the sum would be over 6 categories in our current wing length example).
"Observed" is the observed count falling in a given cell of the table. For example, at 22\(^\circ\)C there are 23 individuals observed in the 0.1 food level category, and 35 in the 1 food level category.
"Expected" is the count that would be expected in each cell if the two variables were truly independent. Each cell has its own expected value, calculated from the marginal totals: the row total multiplied by the proportion of all individuals falling in that column. So in our example, the expected value for the 22\(^\circ\)C, 0.1 food level cell would be
\((23+29+10) \mathrm{~(total~in~the~0.1~food~level)} \times \frac{23+35 \mathrm{~(total~at~temperature~22)}}{23+35+29+28+10+24 \mathrm{~(total~number~of~individuals)}} = 62 \times \frac{58}{149} \approx 24.13\)
The sum of all six such calculations (one for each cell in the table above) gives the \(\chi^{2}\) value that R reported via chisq.test() above. Try it!
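Here is one way to try that by hand, re-entering the counts from the table above so the snippet runs standalone (for tables larger than 2×2, chisq.test() applies no continuity correction, so the values should match exactly):

```r
# Re-enter the observed counts from the table above
obs <- matrix(c(23, 29, 10,
                35, 28, 24),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("0.1", "1"), c("22", "26", "32")))

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)
chi2 <- sum((obs - expected)^2 / expected)
chi2 # about 4.1883, matching the X-squared value above
```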
Now back to the R output from chisq.test() above. Why df = 2? This is calculated as \(DF = (r - 1) \times (c - 1)\), where \(r\) and \(c\) are the number of rows and columns in the \(\chi^{2}\) table, respectively. The same principle you learned before applies here: you lose one degree of freedom for each level of information you need to estimate. There is uncertainty about the marginal totals of both the rows and the columns, so you lose one degree of freedom for each.
Finally, note that the p-value is non-significant, so we cannot reject the null hypothesis that the two factors are independent. This is expected, given that in the study's experimental design each individual could be exposed to only one of the six temperature-food treatments. This dataset contains only females; the chisq.test() would be even more useful if the dataset contained both sexes. We will analyse these data using "interactions" later, in the multiple explanatory variables section.
\(\star\) Include and run the \(\chi^2\) test in your script.
The last thing to do is to save a copy of the wing length data, including our new column of log data.
\(\star\) Use this code in your script to save the data into your data directory:
save(mozwing, file='../data/mozwing.Rdata')
In this section we will explore fitting a linear model to data when you have multiple explanatory (predictor) variables.
The aims of this section are\(^{[1]}\):
Learning to build and fit a linear model that includes several explanatory variables
Learning to interpret the summary tables and diagnostics after fitting a linear model with multiple explanatory variables
We will now look at a single model that includes both explanatory variables.
The first thing to do is look at the data again.
\(\star\) Create a new blank script called MulExpl.R in your Code directory and add some introductory comments.
wings <- read.csv("../data/traitdata_Huxleyetal_2021.csv", stringsAsFactors = T)
Look back at the end of the previous section to see how you saved the RData file. If loglength is missing, just add it to the imported data frame again (go back to the ANOVA section and have a look if you have forgotten how). Also, check to see if temp and food_level are defined as factors.
Use ls(), and then str to check that the data has loaded correctly:
str(wings)
## 'data.frame': 270 obs. of 33 variables:
## $ ID : int 4 6 7 11 11 12 19 20 16 21 ...
## $ exp_no : int 3 3 3 3 3 3 3 3 3 3 ...
## $ rep : Factor w/ 3 levels "A","B","C": 1 1 1 1 3 1 1 1 3 1 ...
## $ dens : num 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 ...
## $ rearing_vessel : Factor w/ 1 level "tub": 1 1 1 1 1 1 1 1 1 1 ...
## $ temp : int 22 22 22 22 22 22 22 22 22 22 ...
## $ food_level : num 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 ...
## $ treatment : Factor w/ 6 levels "22_0.1","22_1",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ feeding_interval : int 24 24 24 24 24 24 24 24 24 24 ...
## $ egg_sub : Factor w/ 2 levels "29/03/2019 12:00",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ l_emerg : Factor w/ 2 levels "30/03/2019 12:00",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ l_emerg_days_frm_sub : int 1 1 1 1 1 1 1 1 1 1 ...
## $ l_death : Factor w/ 29 levels "01/04/2019 12:00",..: NA NA NA NA NA NA NA NA NA NA ...
## $ l_surv : int 0 0 0 0 0 0 0 0 0 0 ...
## $ pupal_ref : int 1 3 4 7 2 8 12 13 6 14 ...
## $ p_emerg : Factor w/ 27 levels "01/05/2019 12:00",..: 24 25 26 3 26 3 5 7 27 7 ...
## $ l_to_p_devtime : int 28 29 30 34 30 34 35 36 32 36 ...
## $ l_to_p_devrate : num 0.0357 0.0345 0.0333 0.0294 0.0333 ...
## $ p_death : Factor w/ 8 levels "03/04/2019 12:00",..: NA NA NA NA NA NA NA NA NA NA ...
## $ p_surv : int 0 0 0 0 0 0 0 0 0 0 ...
## $ a_emerg : Factor w/ 30 levels "01/05/2019 12:00",..: 29 30 1 7 1 7 9 13 3 11 ...
## $ p_to_a_devtime : int 3 3 3 3 3 3 3 4 3 3 ...
## $ p_to_a_devrate : num 0.333 0.333 0.333 0.333 0.333 ...
## $ a_death : Factor w/ 39 levels "01/05/2019 12:00",..: 9 5 7 15 5 15 23 12 14 21 ...
## $ adult_lifespan : int 8 5 5 6 4 6 9 1 7 7 ...
## $ adult_mort_rate : num 0.125 0.2 0.2 0.167 0.25 ...
## $ hatch_to_a_devtime : int 31 32 33 37 33 37 38 40 35 39 ...
## $ hatch_to_adult_devrate: num 0.0323 0.0312 0.0303 0.027 0.0303 ...
## $ sex : Factor w/ 1 level "female": 1 1 1 1 1 1 1 1 1 1 ...
## $ im_surv : int 0 0 0 0 0 0 0 0 0 0 ...
## $ j_lifetime : int 31 32 33 37 33 37 38 40 35 39 ...
## $ total_lifespan : int 39 37 38 43 37 43 47 41 42 46 ...
## $ length_mm : num 2.93 2.59 2.41 2.62 2.28 2.39 2.99 2.5 2.75 2.82 ...
In the ANOVA section, we asked whether temperature and food level each had meaningful effects on wing length. Now we want to ask questions like: how is the size-temperature relationship affected by different food levels?
We need to look at plots within groups.
Before we do that, there is a lot of data in the data frame that we don’t need for our analysis and we should make sure that we are using the same data for our plots and models. We will subset the data down to the complete data for the three variables and log the length of the wings:
wings <- wings %>%
mutate(loglength = log(length_mm))
wings <- subset(wings, select = c(temp, food_level, loglength))
wings <- na.omit(wings)
str(wings)
## 'data.frame': 149 obs. of 3 variables:
## $ temp : int 22 22 22 22 22 22 22 22 22 22 ...
## $ food_level: num 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 ...
## $ loglength : num 1.075 0.952 0.88 0.963 0.824 ...
## - attr(*, "na.action")= 'omit' Named int 38 42 60 103 136 142 143 157 158 159 ...
## ..- attr(*, "names")= chr "38" "42" "60" "103" ...
In the regression section, we used the subset option to fit a model just to dragonflies. You can use subset with plots too.
\(\star\) Add par(mfrow=c(1,2)) to your script to split the graphics into two panels.
\(\star\) Copy over and modify the code from the ANOVA section to create a boxplot of log wing length by temperature in your script.
\(\star\) Now further modify the code to generate the plots shown in the figure below (you will have to subset your data for this, and also use the subset option of the plot command).
You can use the `plot` function's option `main = ` to add titles to a plot.
Lattice again: recall that the lattice package provides some very neat extra ways to plot data in groups. They look pretty, but the downside is that they don't use the base R graphics system, so all those par commands are useless for these graphs. The defaults look good, though!
library(lattice)
bwplot(loglength ~ temp | food_level, data= wings)
The code loglength ~ temp | food_level means plot the relationship between wing length and temperature, but group within levels of food supply. We are using the function bwplot, which is provided by lattice to create box and whisker plots.
\(\star\) Create the lattice plots above from within your script.
Rearrange this code to have three plots, showing the box and whisker plots for food_level, grouped within the levels of temp.
Try reshaping the R plot window and running the command again. Lattice tries to make good use of the available space when creating lattice plots.
We're going to make the barplot code from the ANOVA section even more complicated! This time we want to know the mean log wing length within combinations of temp and food_level. We can still use tapply, providing more than one grouping factor. We create a set of grouping factors like this:
groups <- list(wings$temp, wings$food_level)
groupMeans <- tapply(wings$loglength, groups, FUN = mean)
print(groupMeans)
## 0.1 1
## 22 0.9720317 1.1406533
## 26 0.8937470 1.0744125
## 32 0.8216252 0.9513008
\(\star\) Copy this code into your script and run it.
Use this code and the script from the ANOVA section to get the set of standard errors for the groups groupSE:
seMean <- function(x){
# get rid of missing values
x <- na.omit(x)
# calculate the standard error
se <- sqrt(var(x)/length(x))
# tell the function to report the standard error
return(se)
}
groups <- list(wings$food_level, wings$temp)
groupMeans <- tapply(wings$loglength, groups, FUN=mean)
print(groupMeans)
## 22 26 32
## 0.1 0.9720317 0.893747 0.8216252
## 1 1.1406533 1.074412 0.9513008
groupSE <- tapply(wings$loglength, groups, FUN=seMean)
print(groupSE)
## 22 26 32
## 0.1 0.017990977 0.014147233 0.019430021
## 1 0.004732957 0.005898363 0.005579193
Now we can use barplot. The default option for a barplot of a table is to create a stacked barplot, which is not what we want. The option beside=TRUE makes the bars for each column appear side by side.
Once again, we save the midpoints of the bars to add the error bars. The other options in the code below change the colours of the bars and the length of error bar caps.
# get upper and lower standard error height
upperSE <- groupMeans + groupSE
lowerSE <- groupMeans - groupSE
# create barplot
barMids <- barplot(groupMeans, ylim = c(0, max(upperSE)), beside = TRUE, ylab = 'log(wing length (mm))', col = c('white', 'grey70'))
arrows(barMids, upperSE, barMids, lowerSE, ang = 90, code = 3, len = 0.05)
\(\star\) Generate the barplot above and then edit your script to change the colours and error bar lengths to your taste.
We’ll use the plotmeans function again as an exercise in changing graph settings and preparing figures for reports and write-ups. You should be able to reproduce the figure below.
\(\star\) Use plotmeans from the ANOVA section and the subset option to generate the two plots below. You will need to set the ylim option for the two plots to make them use the same \(y\) axis.
\(\star\) Use text to add labels — the command par('usr') will show you the limits of the plot (\(x_{min}, x_{max}, y_{min}, y_{max}\)) and help pick a location for the labels.
\(\star\) Change the par settings in your code and redraw the plots to try and make better use of the space. In the example below, the box shows the edges of the R graphics window.
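The `par('usr')` trick for placing labels can be sketched like this (the data here are synthetic, and the 5% offsets are just one reasonable choice; substitute your own plot and positions):

```r
# Sketch: using par('usr') to place a panel label in the top-left corner.
# The data here are synthetic; substitute your own plot.
x <- rnorm(50)
y <- x + rnorm(50)
plot(x, y)

# par('usr') returns c(x_min, x_max, y_min, y_max) for the current plot
lims <- par('usr')

# Put the label 5% in from the left edge and 5% down from the top
labX <- lims[1] + 0.05 * (lims[2] - lims[1])
labY <- lims[4] - 0.05 * (lims[4] - lims[3])
text(labX, labY, '(a)')
```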
Note the following about the figure above (generated using plotmeans):
White space: The default options in R use wide margins and spaced-out axes, taking up a lot of space that could be used for plotting data. You’ve already seen the par function and the options mfrow for multiple plots and mar to adjust margin size. The option mgp adjusts the placement of the axis label, tick labels and tick locations. See ?par for help on these options.
Main titles: Adding large titles to graphs is also a bad idea — it uses lots of space to explain something that should be in the figure legend. With multiple plots in a figure, you have to label graphs so that the figure legend can refer to them. You can add labels using text(x,y,'label').
Figure legends: A figure caption and legend should give a clear stand-alone description of the whole figure.
Referring to figures: You must link from your text to your figures — a reader has to know which figures refer to which results. So: “There are clear differences in mean genome size between species at different trophic levels and between ground dwelling and other species, Figure xx”.
All those exploratory visualizations suggest:
Body size in Aedes aegypti decreases with temperature, which is consistent with the temperature-size rule seen in many ectotherms.
Body size in Aedes aegypti increases with larval food supply.
We suspected as much from the ANOVA section analyses, but now we can see that temperature and food supply might have separate effects. We’ll explore this by fitting a linear model with the two explanatory variables added together.
\(\star\) This is an important section — read it through carefully and ask questions if you are unsure. Copy the code into your script and add comments. Do not just jump to the next action item!
\(\star\) First, fit the model:
model <- lm(loglength ~ temp + food_level, data = mozwing)
We’re going to do things right this time and check the model diagnostics before we rush into interpretation.
library(repr) ; options(repr.plot.res = 100, repr.plot.width = 7, repr.plot.height = 8) # Change plot size
par(mfrow=c(2,2))
plot(model)
library(repr) ; options(repr.plot.res = 100, repr.plot.width = 6, repr.plot.height = 6) # Change plot size
Examine these diagnostic plots. There are now six predicted values: three temperature levels for each of the two food levels. The plots look fine, so we can move on to the analysis of variance table:
anova(model)
## Analysis of Variance Table
##
## Response: loglength
## Df Sum Sq Mean Sq F value Pr(>F)
## temp 2 0.58891 0.29445 96.404 < 2.2e-16 ***
## food_level 1 0.96584 0.96584 316.215 < 2.2e-16 ***
## Residuals 145 0.44288 0.00305
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Ignore the \(p\) values! Yes, they’re highly significant but we want to understand the model, not rubber stamp it with ‘significant’.
The sums of squares for the variables are both large compared to the residual sums of squares — there is lots of explained variation! We can calculate the \(r^2\) as explained sums of squares over total sums of squares:
\[\frac{0.59 + 0.97}{0.59 + 0.97 + 0.44} = \frac{1.56}{2.00} = 0.78\]
Food level explains more variation than temperature. This makes intuitive sense from the plots: in the figure we generated above (using plotmeans), there are big differences between the two panels (a vs b) but smaller differences within each panel.
We could also calculate a significance for the whole model by merging the terms. The total explained sums of squares of \(0.59 + 0.97 = 1.56\) uses \(2+1 =3\) degrees of freedom, so the mean sums of squares for all the terms together is \(1.56/3=0.52\). Dividing this by the residual mean square of 0.003 gives an F of \(0.52 / 0.003 = 173.33\).
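These hand calculations are easy to reproduce in R, plugging in the sums of squares and degrees of freedom from the anova table above and converting the F ratio to a p value with `pf()`:

```r
# Reproducing the hand calculations from the ANOVA table above.
ssTemp  <- 0.58891
ssFood  <- 0.96584
ssResid <- 0.44288
dfTerms <- 2 + 1   # temp uses 2 df, food_level uses 1
dfResid <- 145

# r squared: explained sums of squares over total sums of squares
r2 <- (ssTemp + ssFood) / (ssTemp + ssFood + ssResid)
print(r2)   # roughly 0.78

# Whole-model F: mean explained square over residual mean square
Fval <- ((ssTemp + ssFood) / dfTerms) / (ssResid / dfResid)
print(Fval) # roughly 170

# Convert F to a p value using the upper tail of the F distribution
pf(Fval, dfTerms, dfResid, lower.tail = FALSE)
```

Using the unrounded sums of squares gives an F of about 169.7 rather than 173.3, which is why the summary table below differs slightly from the rounded hand calculation.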
Now we can look at the summary table to see the coefficients:
summary(model)
##
## Call:
## lm(formula = loglength ~ temp + food_level, data = mozwing)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.193070 -0.024347 0.004734 0.032994 0.121469
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.973804 0.009180 106.078 < 2e-16 ***
## temp26 -0.072698 0.010361 -7.017 8.05e-11 ***
## temp32 -0.177597 0.011975 -14.830 < 2e-16 ***
## food_level1 0.165684 0.009317 17.782 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05527 on 145 degrees of freedom
## Multiple R-squared: 0.7783, Adjusted R-squared: 0.7737
## F-statistic: 169.7 on 3 and 145 DF, p-value: < 2.2e-16
Starting at the bottom of this output, summary has again calculated \(r^2\) for us and also an \(F\) statistic for the whole model, which matches the calculation above.
The other important bits are the four coefficients. The intercept is now the reference level for both variables: it is the mean for individuals reared at 22oC with low food supply (0.1 mg day^-1). The remaining coefficients are differences from this value for individuals reared at the other temperatures and at the high food level. Wing length changes substantially with both temperature and food level, and both have large effect sizes: increasing temperature from 22oC to 32oC produced roughly a 20% decrease in wing length, and increased food at 22oC increased wing length by ~20%.
Because the difference is large and the standard error is small, the \(t\) value suggests that this difference did not arise just by chance. Put another way, it is significant.
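Each t value in the summary table is just the estimate divided by its standard error, and the p value comes from the t distribution on the residual degrees of freedom. As a check, here is the food_level1 row recomputed from the numbers in the table above:

```r
# The t value is just the estimate divided by its standard error.
# Numbers taken from the food_level1 row of the summary table above.
est <- 0.165684
se  <- 0.009317
tval <- est / se
print(tval)   # roughly 17.8, matching the table

# Two-tailed p value from the t distribution with the residual df
2 * pt(-abs(tval), df = 145)
```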
The table below shows how these four coefficients combine to give the predicted values for each of the group means.
| | 22oC | 26oC | 32oC |
|---|---|---|---|
| low food | 0.97 = 0.97 | 0.97 - 0.07 = 0.90 | 0.97 - 0.18 = 0.79 |
| high food | 0.97 + 0.17 = 1.14 | 0.97 - 0.07 + 0.17 = 1.07 | 0.97 - 0.18 + 0.17 = 0.96 |
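The same predicted group means can be reconstructed in R from the (rounded) coefficients in the summary table above:

```r
# Reconstructing the predicted group means from the (rounded)
# coefficients in the summary table above.
coefs <- c(intercept = 0.97, temp26 = -0.07, temp32 = -0.18, food1 = 0.17)

# Low food: intercept plus the temperature differences
lowFood  <- coefs['intercept'] + c(0, coefs['temp26'], coefs['temp32'])
# High food: the same values shifted up by the food_level1 coefficient
highFood <- lowFood + coefs['food1']

pred <- rbind(low = lowFood, high = highFood)
colnames(pred) <- c('22', '26', '32')
print(round(pred, 2))
```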
Here you will build on your skills in fitting linear models with multiple explanatory variables. You will also learn about another commonly used Linear Model fitting technique: ANCOVA.
We will build two models in this section:
Model 1: Is Ae. aegypti wing length predicted by interactions between temperature and food level?
ANCOVA: Is body size in Odonata predicted by interactions between genome size and taxonomic suborder?
So far, we have only looked at the independent effects of variables. For example, in the temperature and food level model from the first multiple explanatory variables section, we estimated a single effect of low versus high larval food supply, not a separate effect for each temperature-food level combination. These independent effects of a variable are known as main effects, and the effects of combinations of variables acting together are known as interactions.
The aims of this section are\(^{[1]}\):
Creating more complex Linear Models with multiple explanatory variables
Including the effects of interactions between multiple variables in a linear model
Plotting predictions from more complex (multiple explanatory variables) linear models
We’ve already seen a number of different model formulae in R. They all use this syntax:
response variable ~ explanatory variable(s)
But we are now going to see two extra pieces of syntax:
y ~ a + b + a:b: The a:b means the interaction between a and b — do combinations of these variables lead to different outcomes?
y ~ a * b: This is shorthand for the model above. It means fit a and b as main effects plus their interaction a:b.
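You can verify that the two formulae are equivalent by fitting both to the same data and comparing the coefficients. The small two-factor dataset below is synthetic, made up just for this check:

```r
# Checking that y ~ a * b is shorthand for y ~ a + b + a:b,
# using a small synthetic dataset.
set.seed(42)
d <- data.frame(
  a = factor(rep(c('x', 'y'), each = 10)),
  b = factor(rep(c('p', 'q'), times = 10))
)
d$y <- rnorm(20)

m1 <- lm(y ~ a * b, data = d)
m2 <- lm(y ~ a + b + a:b, data = d)

# The two formulae expand to the same design matrix, so the fits are identical
print(all.equal(coef(m1), coef(m2)))  # TRUE
```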
\(\star\) Make sure you have changed the working directory to Code in your stats coursework directory.
\(\star\) Create a new blank script called ‘Interactions.R’ and add some introductory comments.
\(\star\) Read in the data:
mozwing <- read.csv('../data/traitdata_Huxleyetal_2021.csv', stringsAsFactors = TRUE)
Note that the result is assigned to mozwing; calling read.csv on its own would print the entire data frame to the console. Check the data with head(mozwing) and str(mozwing) before moving on.
## 92 <NA> 0 20 04/04/2019 12:00 6
## 93 <NA> 0 23 04/04/2019 12:00 6
## 94 <NA> 0 27 05/04/2019 12:00 7
## 95 <NA> 0 5 04/04/2019 12:00 6
## 96 <NA> 0 6 04/04/2019 12:00 6
## 97 <NA> 0 7 04/04/2019 12:00 6
## 98 <NA> 0 8 04/04/2019 12:00 6
## 99 <NA> 0 11 04/04/2019 12:00 6
## 100 <NA> 0 18 05/04/2019 12:00 7
## 101 <NA> 0 19 05/04/2019 12:00 7
## 102 <NA> 0 20 05/04/2019 12:00 7
## 103 <NA> 0 21 05/04/2019 12:00 7
## 104 <NA> 0 22 05/04/2019 12:00 7
## 105 <NA> 0 23 05/04/2019 12:00 7
## 106 <NA> 0 3 19/04/2019 12:00 20
## 107 <NA> 0 20 07/04/2019 12:00 9
## 108 <NA> 0 6 08/04/2019 12:00 9
## 109 <NA> 0 2 19/04/2019 12:00 20
## 110 <NA> 0 4 23/04/2019 12:00 24
## 111 <NA> 0 6 25/04/2019 12:00 26
## 112 <NA> 0 1 08/04/2019 12:00 9
## 113 <NA> 0 2 12/04/2019 12:00 13
## 114 <NA> 0 8 24/04/2019 12:00 25
## 115 <NA> 0 9 24/04/2019 12:00 25
## 116 <NA> 0 9 23/04/2019 12:00 24
## 117 <NA> 0 21 07/04/2019 12:00 9
## 118 <NA> 0 4 02/04/2019 12:00 3
## 119 <NA> 0 8 03/04/2019 12:00 4
## 120 <NA> 0 19 05/04/2019 12:00 7
## 121 <NA> 0 12 03/04/2019 12:00 4
## 122 <NA> 0 13 03/04/2019 12:00 4
## 123 <NA> 0 22 08/04/2019 12:00 10
## 124 <NA> 0 14 03/04/2019 12:00 4
## 125 <NA> 0 20 05/04/2019 12:00 7
## 126 <NA> 0 23 08/04/2019 12:00 10
## 127 <NA> 0 17 03/04/2019 12:00 4
## 128 <NA> 0 18 03/04/2019 12:00 4
## 129 <NA> 0 24 08/04/2019 12:00 10
## 130 <NA> 0 21 03/04/2019 12:00 4
## 131 <NA> 0 22 04/04/2019 12:00 5
## 132 <NA> 0 23 05/04/2019 12:00 7
## 133 <NA> 0 1 03/04/2019 12:00 4
## 134 <NA> 0 2 03/04/2019 12:00 4
## 135 <NA> 0 3 03/04/2019 12:00 4
## 136 <NA> 0 4 03/04/2019 12:00 4
## 137 <NA> 0 6 03/04/2019 12:00 4
## 138 <NA> 0 7 03/04/2019 12:00 4
## 139 <NA> 0 8 03/04/2019 12:00 4
## 140 <NA> 0 11 03/04/2019 12:00 4
## 141 <NA> 0 12 03/04/2019 12:00 4
## 142 <NA> 0 13 03/04/2019 12:00 4
## 143 <NA> 0 14 03/04/2019 12:00 4
## 144 <NA> 0 15 03/04/2019 12:00 4
## 145 <NA> 0 21 05/04/2019 12:00 6
## 146 <NA> 0 3 03/04/2019 12:00 4
## 147 <NA> 0 4 03/04/2019 12:00 4
## 148 <NA> 0 7 03/04/2019 12:00 4
## 149 <NA> 0 11 03/04/2019 12:00 4
## 150 <NA> 0 16 03/04/2019 12:00 4
## 151 <NA> 0 15 17/04/2019 12:00 18
## 152 <NA> 0 16 24/04/2019 12:00 25
## 153 <NA> 0 24 06/04/2019 12:00 8
## 154 <NA> 0 27 10/04/2019 12:00 12
## 155 <NA> 0 11 26/04/2019 12:00 27
## 156 <NA> 0 12 27/04/2019 12:00 28
## 157 <NA> 0 13 06/05/2019 12:00 38
## 158 16/04/2019 12:00 1 NA <NA> NA
## 159 09/05/2019 12:00 1 NA <NA> NA
## 160 15/05/2019 12:00 1 NA <NA> NA
## 161 06/04/2019 12:00 1 NA <NA> NA
## 162 28/04/2019 12:00 1 NA <NA> NA
## 163 15/05/2019 12:00 1 NA <NA> NA
## 164 15/05/2019 12:00 1 NA <NA> NA
## 165 15/05/2019 12:00 1 NA <NA> NA
## 166 06/05/2019 12:00 1 NA <NA> NA
## 167 09/04/2019 12:00 1 NA <NA> NA
## 168 15/05/2019 12:00 1 NA <NA> NA
## 169 20/04/2019 12:00 1 NA <NA> NA
## 170 29/04/2019 12:00 1 NA <NA> NA
## 171 09/04/2019 12:00 1 NA <NA> NA
## 172 04/05/2019 12:00 1 NA <NA> NA
## 173 02/04/2019 12:00 1 NA <NA> NA
## 174 31/03/2019 12:00 1 NA <NA> NA
## 175 08/04/2019 12:00 1 NA <NA> NA
## 176 31/03/2019 12:00 1 NA <NA> NA
## 177 20/04/2019 12:00 1 NA <NA> NA
## 178 12/05/2019 12:00 1 NA <NA> NA
## 179 <NA> 0 26 08/04/2019 12:00 10
## 180 01/04/2019 12:00 1 NA <NA> NA
## 181 07/04/2019 12:00 1 NA <NA> NA
## 182 <NA> 0 22 07/04/2019 12:00 9
## 183 <NA> 0 17 07/04/2019 12:00 9
## 184 03/04/2019 12:00 1 NA <NA> NA
## 185 04/04/2019 12:00 1 NA <NA> NA
## 186 02/04/2019 12:00 1 NA <NA> NA
## 187 <NA> 0 17 02/05/2019 12:00 33
## 188 29/04/2019 12:00 1 NA <NA> NA
## 189 03/05/2019 12:00 1 NA <NA> NA
## 190 02/04/2019 12:00 1 NA <NA> NA
## 191 18/04/2019 12:00 1 NA <NA> NA
## 192 04/05/2019 12:00 1 NA <NA> NA
## 193 25/04/2019 12:00 1 NA <NA> NA
## 194 28/04/2019 12:00 1 NA <NA> NA
## 195 04/05/2019 12:00 1 NA <NA> NA
## 196 16/04/2019 12:00 1 NA <NA> NA
## 197 31/03/2019 12:00 1 NA <NA> NA
## 198 15/05/2019 12:00 1 NA <NA> NA
## 199 28/04/2019 12:00 1 NA <NA> NA
## 200 03/05/2019 12:00 1 NA <NA> NA
## 201 03/05/2019 12:00 1 NA <NA> NA
## 202 23/04/2019 12:00 1 NA <NA> NA
## 203 01/04/2019 12:00 1 NA <NA> NA
## 204 <NA> 0 21 05/04/2019 12:00 7
## 205 <NA> 0 9 04/04/2019 12:00 6
## 206 31/03/2019 12:00 1 NA <NA> NA
## 207 31/03/2019 12:00 1 NA <NA> NA
## 208 31/03/2019 12:00 1 NA <NA> NA
## 209 31/03/2019 12:00 1 NA <NA> NA
## 210 06/04/2019 12:00 1 NA <NA> NA
## 211 01/04/2019 12:00 1 NA <NA> NA
## 212 31/03/2019 12:00 1 NA <NA> NA
## 213 31/03/2019 12:00 1 NA <NA> NA
## 214 31/03/2019 12:00 1 NA <NA> NA
## 215 <NA> 0 1 03/04/2019 12:00 5
## 216 <NA> 0 13 04/04/2019 12:00 6
## 217 <NA> 0 15 04/04/2019 12:00 6
## 218 31/03/2019 12:00 1 NA <NA> NA
## 219 18/04/2019 12:00 1 NA <NA> NA
## 220 11/04/2019 12:00 1 NA <NA> NA
## 221 31/03/2019 12:00 1 NA <NA> NA
## 222 07/04/2019 12:00 1 NA <NA> NA
## 223 04/04/2019 12:00 1 NA <NA> NA
## 224 31/03/2019 12:00 1 NA <NA> NA
## 225 31/03/2019 12:00 1 NA <NA> NA
## 226 23/04/2019 12:00 1 NA <NA> NA
## 227 14/04/2019 12:00 1 NA <NA> NA
## 228 18/04/2019 12:00 1 NA <NA> NA
## 229 31/03/2019 12:00 1 NA <NA> NA
## 230 18/04/2019 12:00 1 NA <NA> NA
## 231 14/04/2019 12:00 1 NA <NA> NA
## 232 23/04/2019 12:00 1 NA <NA> NA
## 233 31/03/2019 12:00 1 NA <NA> NA
## 234 27/04/2019 12:00 1 NA <NA> NA
## 235 01/04/2019 12:00 1 NA <NA> NA
## 236 13/04/2019 12:00 1 NA <NA> NA
## 237 08/04/2019 12:00 1 NA <NA> NA
## 238 13/04/2019 12:00 1 NA <NA> NA
## 239 <NA> 0 4 17/04/2019 12:00 18
## 240 31/03/2019 12:00 1 NA <NA> NA
## 241 31/03/2019 12:00 1 NA <NA> NA
## 242 19/04/2019 12:00 1 NA <NA> NA
## 243 17/04/2019 12:00 1 NA <NA> NA
## 244 14/04/2019 12:00 1 NA <NA> NA
## 245 01/04/2019 12:00 1 NA <NA> NA
## 246 18/04/2019 12:00 1 NA <NA> NA
## 247 09/05/2019 12:00 1 NA <NA> NA
## 248 15/04/2019 12:00 1 NA <NA> NA
## 249 31/03/2019 12:00 1 NA <NA> NA
## 250 13/04/2019 12:00 1 NA <NA> NA
## 251 01/04/2019 12:00 1 NA <NA> NA
## 252 14/04/2019 12:00 1 NA <NA> NA
## 253 31/03/2019 12:00 1 NA <NA> NA
## 254 31/03/2019 12:00 1 NA <NA> NA
## 255 31/03/2019 12:00 1 NA <NA> NA
## 256 01/04/2019 12:00 1 NA <NA> NA
## 257 03/04/2019 12:00 1 NA <NA> NA
## 258 31/03/2019 12:00 1 NA <NA> NA
## 259 31/03/2019 12:00 1 NA <NA> NA
## 260 31/03/2019 12:00 1 NA <NA> NA
## 261 31/03/2019 12:00 1 NA <NA> NA
## 262 31/03/2019 12:00 1 NA <NA> NA
## 263 31/03/2019 12:00 1 NA <NA> NA
## 264 31/03/2019 12:00 1 NA <NA> NA
## 265 31/03/2019 12:00 1 NA <NA> NA
## 266 31/03/2019 12:00 1 NA <NA> NA
## 267 31/03/2019 12:00 1 NA <NA> NA
## 268 31/03/2019 12:00 1 NA <NA> NA
## 269 31/03/2019 12:00 1 NA <NA> NA
## 270 31/03/2019 12:00 1 NA <NA> NA
## l_to_p_devrate p_death p_surv a_emerg p_to_a_devtime
## 1 0.03571429 <NA> 0 29/04/2019 12:00 3
## 2 0.03448276 <NA> 0 30/04/2019 12:00 3
## 3 0.03333333 <NA> 0 01/05/2019 12:00 3
## 4 0.02941176 <NA> 0 05/05/2019 12:00 3
## 5 0.03333333 <NA> 0 01/05/2019 12:00 3
## 6 0.02941176 <NA> 0 05/05/2019 12:00 3
## 7 0.02857143 <NA> 0 06/05/2019 12:00 3
## 8 0.02777778 <NA> 0 08/05/2019 12:00 4
## 9 0.03125000 <NA> 0 03/05/2019 12:00 3
## 10 0.02777778 <NA> 0 07/05/2019 12:00 3
## 11 0.03448276 <NA> 0 30/04/2019 12:00 3
## 12 0.03333333 <NA> 0 01/05/2019 12:00 3
## 13 0.03030303 <NA> 0 05/05/2019 12:00 4
## 14 0.02777778 <NA> 0 07/05/2019 12:00 3
## 15 0.02777778 <NA> 0 07/05/2019 12:00 3
## 16 0.03125000 <NA> 0 03/05/2019 12:00 3
## 17 0.02941176 <NA> 0 05/05/2019 12:00 3
## 18 0.02857143 <NA> 0 06/05/2019 12:00 3
## 19 0.02857143 <NA> 0 06/05/2019 12:00 3
## 20 0.02777778 <NA> 0 07/05/2019 12:00 3
## 21 0.02857143 <NA> 0 06/05/2019 12:00 3
## 22 0.02857143 <NA> 0 06/05/2019 12:00 3
## 23 0.02702703 <NA> 0 09/05/2019 12:00 4
## 24 0.20000000 <NA> 0 06/04/2019 12:00 3
## 25 0.12500000 <NA> 0 09/04/2019 12:00 3
## 26 0.12500000 <NA> 0 09/04/2019 12:00 3
## 27 0.16666667 <NA> 0 06/04/2019 12:00 2
## 28 0.16666667 <NA> 0 06/04/2019 12:00 2
## 29 0.16666667 <NA> 0 07/04/2019 12:00 3
## 30 0.11111111 <NA> 0 10/04/2019 12:00 3
## 31 0.11111111 <NA> 0 10/04/2019 12:00 3
## 32 0.14285714 <NA> 0 09/04/2019 12:00 4
## 33 0.12500000 <NA> 0 09/04/2019 12:00 3
## 34 0.12500000 <NA> 0 09/04/2019 12:00 3
## 35 0.12500000 <NA> 0 09/04/2019 12:00 3
## 36 0.12500000 <NA> 0 09/04/2019 12:00 3
## 37 0.11111111 <NA> 0 10/04/2019 12:00 3
## 38 0.11111111 <NA> 0 10/04/2019 12:00 3
## 39 0.11111111 <NA> 0 10/04/2019 12:00 3
## 40 0.11111111 <NA> 0 10/04/2019 12:00 3
## 41 0.11111111 <NA> 0 10/04/2019 12:00 3
## 42 0.11111111 <NA> 0 11/04/2019 12:00 4
## 43 0.11111111 <NA> 0 10/04/2019 12:00 3
## 44 0.10000000 <NA> 0 12/04/2019 12:00 4
## 45 0.08333333 <NA> 0 14/04/2019 12:00 4
## 46 0.12500000 <NA> 0 09/04/2019 12:00 3
## 47 0.12500000 <NA> 0 09/04/2019 12:00 3
## 48 0.12500000 <NA> 0 10/04/2019 12:00 4
## 49 0.11111111 <NA> 0 10/04/2019 12:00 3
## 50 0.11111111 <NA> 0 10/04/2019 12:00 3
## 51 0.11111111 <NA> 0 10/04/2019 12:00 3
## 52 0.11111111 <NA> 0 10/04/2019 12:00 3
## 53 0.11111111 <NA> 0 10/04/2019 12:00 3
## 54 0.11111111 <NA> 0 10/04/2019 12:00 3
## 55 0.11111111 <NA> 0 10/04/2019 12:00 3
## 56 0.10000000 <NA> 0 11/04/2019 12:00 3
## 57 0.05555556 <NA> 0 20/04/2019 12:00 3
## 58 0.05000000 <NA> 0 22/04/2019 12:00 3
## 59 0.04000000 <NA> 0 27/04/2019 12:00 3
## 60 0.16666667 <NA> 0 06/04/2019 12:00 2
## 61 0.03703704 <NA> 0 28/04/2019 12:00 2
## 62 0.03571429 <NA> 0 29/04/2019 12:00 2
## 63 0.03225807 <NA> 0 02/05/2019 12:00 2
## 64 0.03125000 <NA> 0 04/05/2019 12:00 3
## 65 0.03225807 <NA> 0 02/05/2019 12:00 2
## 66 0.03846154 <NA> 0 27/04/2019 12:00 2
## 67 0.03225807 <NA> 0 02/05/2019 12:00 2
## 68 0.05263158 <NA> 0 20/04/2019 12:00 2
## 69 0.04347826 <NA> 0 24/04/2019 12:00 2
## 70 0.04000000 <NA> 0 26/04/2019 12:00 2
## 71 0.03846154 <NA> 0 27/04/2019 12:00 2
## 72 0.03703704 <NA> 0 29/04/2019 12:00 3
## 73 0.03703704 <NA> 0 28/04/2019 12:00 2
## 74 0.03703704 <NA> 0 28/04/2019 12:00 2
## 75 0.03571429 <NA> 0 29/04/2019 12:00 2
## 76 0.04000000 <NA> 0 26/04/2019 12:00 2
## 77 0.05000000 <NA> 0 21/04/2019 12:00 2
## 78 0.04761905 <NA> 0 22/04/2019 12:00 2
## 79 0.04000000 <NA> 0 26/04/2019 12:00 2
## 80 0.03846154 <NA> 0 27/04/2019 12:00 2
## 81 0.03703704 <NA> 0 28/04/2019 12:00 2
## 82 0.03125000 <NA> 0 03/05/2019 12:00 2
## 83 0.16666667 <NA> 0 06/04/2019 12:00 2
## 84 0.16666667 <NA> 0 06/04/2019 12:00 2
## 85 0.16666667 <NA> 0 07/04/2019 12:00 3
## 86 0.11111111 <NA> 0 10/04/2019 12:00 3
## 87 0.11111111 <NA> 0 10/04/2019 12:00 3
## 88 0.16666667 <NA> 0 06/04/2019 12:00 2
## 89 0.16666667 <NA> 0 06/04/2019 12:00 2
## 90 0.16666667 <NA> 0 06/04/2019 12:00 2
## 91 0.16666667 <NA> 0 06/04/2019 12:00 2
## 92 0.16666667 <NA> 0 06/04/2019 12:00 2
## 93 0.16666667 <NA> 0 07/04/2019 12:00 3
## 94 0.14285714 <NA> 0 07/04/2019 12:00 2
## 95 0.16666667 <NA> 0 06/04/2019 12:00 2
## 96 0.16666667 <NA> 0 06/04/2019 12:00 2
## 97 0.16666667 <NA> 0 06/04/2019 12:00 2
## 98 0.16666667 <NA> 0 06/04/2019 12:00 2
## 99 0.16666667 <NA> 0 06/04/2019 12:00 2
## 100 0.14285714 <NA> 0 07/04/2019 12:00 2
## 101 0.14285714 <NA> 0 07/04/2019 12:00 2
## 102 0.14285714 <NA> 0 07/04/2019 12:00 2
## 103 0.14285714 <NA> 0 06/04/2019 12:00 1
## 104 0.14285714 <NA> 0 08/04/2019 12:00 3
## 105 0.14285714 <NA> 0 07/04/2019 12:00 2
## 106 0.05000000 <NA> 0 21/04/2019 12:00 2
## 107 0.11111111 <NA> 0 10/04/2019 12:00 3
## 108 0.11111111 <NA> 0 09/04/2019 12:00 1
## 109 0.05000000 <NA> 0 20/04/2019 12:00 1
## 110 0.04166667 <NA> 0 25/04/2019 12:00 2
## 111 0.03846154 <NA> 0 26/04/2019 12:00 1
## 112 0.11111111 <NA> 0 10/04/2019 12:00 2
## 113 0.07692308 <NA> 0 14/04/2019 12:00 2
## 114 0.04000000 <NA> 0 25/04/2019 12:00 1
## 115 0.04000000 <NA> 0 25/04/2019 12:00 1
## 116 0.04166667 <NA> 0 25/04/2019 12:00 2
## 117 0.11111111 <NA> 0 11/04/2019 12:00 4
## 118 0.33333333 <NA> 0 04/04/2019 12:00 2
## 119 0.25000000 <NA> 0 05/04/2019 12:00 2
## 120 0.14285714 <NA> 0 07/04/2019 12:00 2
## 121 0.25000000 <NA> 0 05/04/2019 12:00 2
## 122 0.25000000 <NA> 0 05/04/2019 12:00 2
## 123 0.10000000 <NA> 0 11/04/2019 12:00 3
## 124 0.25000000 <NA> 0 04/04/2019 12:00 1
## 125 0.14285714 <NA> 0 08/04/2019 12:00 3
## 126 0.10000000 <NA> 0 11/04/2019 12:00 3
## 127 0.25000000 <NA> 0 05/04/2019 12:00 2
## 128 0.25000000 <NA> 0 05/04/2019 12:00 2
## 129 0.10000000 <NA> 0 11/04/2019 12:00 3
## 130 0.25000000 <NA> 0 05/04/2019 12:00 2
## 131 0.20000000 <NA> 0 05/04/2019 12:00 1
## 132 0.14285714 <NA> 0 07/04/2019 12:00 2
## 133 0.25000000 <NA> 0 05/04/2019 12:00 2
## 134 0.25000000 <NA> 0 04/04/2019 12:00 1
## 135 0.25000000 <NA> 0 05/04/2019 12:00 2
## 136 0.25000000 <NA> 0 05/04/2019 12:00 2
## 137 0.25000000 <NA> 0 05/04/2019 12:00 2
## 138 0.25000000 <NA> 0 04/04/2019 12:00 1
## 139 0.25000000 <NA> 0 04/04/2019 12:00 1
## 140 0.25000000 <NA> 0 05/04/2019 12:00 2
## 141 0.25000000 <NA> 0 05/04/2019 12:00 2
## 142 0.25000000 <NA> 0 05/04/2019 12:00 2
## 143 0.25000000 <NA> 0 05/04/2019 12:00 2
## 144 0.25000000 <NA> 0 05/04/2019 12:00 2
## 145 0.16666667 <NA> 0 06/04/2019 12:00 1
## 146 0.25000000 <NA> 0 05/04/2019 12:00 2
## 147 0.25000000 <NA> 0 04/04/2019 12:00 1
## 148 0.25000000 <NA> 0 05/04/2019 12:00 2
## 149 0.25000000 <NA> 0 05/04/2019 12:00 2
## 150 0.25000000 <NA> 0 05/04/2019 12:00 2
## 151 0.05555556 <NA> 0 19/04/2019 12:00 2
## 152 0.04000000 <NA> 0 26/04/2019 12:00 2
## 153 0.12500000 <NA> 0 08/04/2019 12:00 2
## 154 0.08333333 <NA> 0 14/04/2019 12:00 4
## 155 0.03703704 <NA> 0 28/04/2019 12:00 2
## 156 0.03571429 <NA> 0 29/04/2019 12:00 2
## 157 0.02631579 07/05/2019 12:00 1 <NA> NA
## 158 NA <NA> NA <NA> NA
## 159 NA <NA> NA <NA> NA
## 160 NA <NA> NA <NA> NA
## 161 NA <NA> NA <NA> NA
## 162 NA <NA> NA <NA> NA
## 163 NA <NA> NA <NA> NA
## 164 NA <NA> NA <NA> NA
## 165 NA <NA> NA <NA> NA
## 166 NA <NA> NA <NA> NA
## 167 NA <NA> NA <NA> NA
## 168 NA <NA> NA <NA> NA
## 169 NA <NA> NA <NA> NA
## 170 NA <NA> NA <NA> NA
## 171 NA <NA> NA <NA> NA
## 172 NA <NA> NA <NA> NA
## 173 NA <NA> NA <NA> NA
## 174 NA <NA> NA <NA> NA
## 175 NA <NA> NA <NA> NA
## 176 NA <NA> NA <NA> NA
## 177 NA <NA> NA <NA> NA
## 178 NA <NA> NA <NA> NA
## 179 0.10000000 12/04/2019 12:00 1 <NA> NA
## 180 NA <NA> NA <NA> NA
## 181 NA <NA> NA <NA> NA
## 182 0.11111111 08/04/2019 12:00 1 <NA> NA
## 183 0.11111111 10/04/2019 12:00 1 <NA> NA
## 184 NA <NA> NA <NA> NA
## 185 NA <NA> NA <NA> NA
## 186 NA <NA> NA <NA> NA
## 187 0.03030303 03/04/2019 12:00 1 <NA> NA
## 188 NA <NA> NA <NA> NA
## 189 NA <NA> NA <NA> NA
## 190 NA <NA> NA <NA> NA
## 191 NA <NA> NA <NA> NA
## 192 NA <NA> NA <NA> NA
## 193 NA <NA> NA <NA> NA
## 194 NA <NA> NA <NA> NA
## 195 NA <NA> NA <NA> NA
## 196 NA <NA> NA <NA> NA
## 197 NA <NA> NA <NA> NA
## 198 NA <NA> NA <NA> NA
## 199 NA <NA> NA <NA> NA
## 200 NA <NA> NA <NA> NA
## 201 NA <NA> NA <NA> NA
## 202 NA <NA> NA <NA> NA
## 203 NA <NA> NA <NA> NA
## 204 0.14285714 08/04/2019 12:00 1 <NA> NA
## 205 0.16666667 05/04/2019 12:00 1 <NA> NA
## 206 NA <NA> NA <NA> NA
## 207 NA <NA> NA <NA> NA
## 208 NA <NA> NA <NA> NA
## 209 NA <NA> NA <NA> NA
## 210 NA <NA> NA <NA> NA
## 211 NA <NA> NA <NA> NA
## 212 NA <NA> NA <NA> NA
## 213 NA <NA> NA <NA> NA
## 214 NA <NA> NA <NA> NA
## 215 0.20000000 05/04/2019 12:00 1 <NA> NA
## 216 0.16666667 06/04/2019 12:00 1 <NA> NA
## 217 0.16666667 05/04/2019 12:00 1 <NA> NA
## 218 NA <NA> NA <NA> NA
## 219 NA <NA> NA <NA> NA
## 220 NA <NA> NA <NA> NA
## 221 NA <NA> NA <NA> NA
## 222 NA <NA> NA <NA> NA
## 223 NA <NA> NA <NA> NA
## 224 NA <NA> NA <NA> NA
## 225 NA <NA> NA <NA> NA
## 226 NA <NA> NA <NA> NA
## 227 NA <NA> NA <NA> NA
## 228 NA <NA> NA <NA> NA
## 229 NA <NA> NA <NA> NA
## 230 NA <NA> NA <NA> NA
## 231 NA <NA> NA <NA> NA
## 232 NA <NA> NA <NA> NA
## 233 NA <NA> NA <NA> NA
## 234 NA <NA> NA <NA> NA
## 235 NA <NA> NA <NA> NA
## 236 NA <NA> NA <NA> NA
## 237 NA <NA> NA <NA> NA
## 238 NA <NA> NA <NA> NA
## 239 0.05555556 19/04/2019 12:00 1 <NA> NA
## 240 NA <NA> NA <NA> NA
## 241 NA <NA> NA <NA> NA
## 242 NA <NA> NA <NA> NA
## 243 NA <NA> NA <NA> NA
## 244 NA <NA> NA <NA> NA
## 245 NA <NA> NA <NA> NA
## 246 NA <NA> NA <NA> NA
## 247 NA <NA> NA <NA> NA
## 248 NA <NA> NA <NA> NA
## 249 NA <NA> NA <NA> NA
## 250 NA <NA> NA <NA> NA
## 251 NA <NA> NA <NA> NA
## 252 NA <NA> NA <NA> NA
## 253 NA <NA> NA <NA> NA
## 254 NA <NA> NA <NA> NA
## 255 NA <NA> NA <NA> NA
## 256 NA <NA> NA <NA> NA
## 257 NA <NA> NA <NA> NA
## 258 NA <NA> NA <NA> NA
## 259 NA <NA> NA <NA> NA
## 260 NA <NA> NA <NA> NA
## 261 NA <NA> NA <NA> NA
## 262 NA <NA> NA <NA> NA
## 263 NA <NA> NA <NA> NA
## 264 NA <NA> NA <NA> NA
## 265 NA <NA> NA <NA> NA
## 266 NA <NA> NA <NA> NA
## 267 NA <NA> NA <NA> NA
## 268 NA <NA> NA <NA> NA
## 269 NA <NA> NA <NA> NA
## 270 NA <NA> NA <NA> NA
## p_to_a_devrate a_death adult_lifespan adult_mort_rate
## 1 0.3333333 07/05/2019 12:00 8 0.12500000
## 2 0.3333333 05/05/2019 12:00 5 0.20000000
## 3 0.3333333 06/05/2019 12:00 5 0.20000000
## 4 0.3333333 11/05/2019 12:00 6 0.16666667
## 5 0.3333333 05/05/2019 12:00 4 0.25000000
## 6 0.3333333 11/05/2019 12:00 6 0.16666667
## 7 0.3333333 15/05/2019 12:00 9 0.11111111
## 8 0.2500000 09/05/2019 12:00 1 1.00000000
## 9 0.3333333 10/05/2019 12:00 7 0.14285714
## 10 0.3333333 14/05/2019 12:00 7 0.14285714
## 11 0.3333333 07/05/2019 12:00 7 0.14285714
## 12 0.3333333 06/05/2019 12:00 5 0.20000000
## 13 0.2500000 14/05/2019 12:00 9 0.11111111
## 14 0.3333333 14/05/2019 12:00 7 0.14285714
## 15 0.3333333 14/05/2019 12:00 7 0.14285714
## 16 0.3333333 09/05/2019 12:00 6 0.16666667
## 17 0.3333333 13/05/2019 12:00 8 0.12500000
## 18 0.3333333 15/05/2019 12:00 9 0.11111111
## 19 0.3333333 17/05/2019 12:00 11 0.09090909
## 20 0.3333333 13/05/2019 12:00 6 0.16666667
## 21 0.3333333 14/05/2019 12:00 8 0.12500000
## 22 0.3333333 14/05/2019 12:00 8 0.12500000
## 23 0.2500000 12/05/2019 12:00 3 0.33333333
## 24 0.3333333 12/04/2019 12:00 6 0.16666667
## 25 0.3333333 24/04/2019 12:00 15 0.06666667
## 26 0.3333333 20/04/2019 12:00 11 0.09090909
## 27 0.5000000 14/04/2019 12:00 8 0.12500000
## 28 0.5000000 10/04/2019 12:00 4 0.25000000
## 29 0.3333333 13/04/2019 12:00 6 0.16666667
## 30 0.3333333 24/04/2019 12:00 14 0.07142857
## 31 0.3333333 22/04/2019 12:00 12 0.08333333
## 32 0.2500000 20/04/2019 12:00 11 0.09090909
## 33 0.3333333 21/04/2019 12:00 12 0.08333333
## 34 0.3333333 18/04/2019 12:00 9 0.11111111
## 35 0.3333333 21/04/2019 12:00 12 0.08333333
## 36 0.3333333 20/04/2019 12:00 11 0.09090909
## 37 0.3333333 18/04/2019 12:00 8 0.12500000
## 38 0.3333333 23/04/2019 12:00 13 0.07692308
## 39 0.3333333 21/04/2019 12:00 11 0.09090909
## 40 0.3333333 22/04/2019 12:00 12 0.08333333
## 41 0.3333333 25/04/2019 12:00 15 0.06666667
## 42 0.2500000 24/04/2019 12:00 13 0.07692308
## 43 0.3333333 23/04/2019 12:00 13 0.07692308
## 44 0.2500000 24/04/2019 12:00 12 0.08333333
## 45 0.2500000 29/04/2019 12:00 15 0.06666667
## 46 0.3333333 18/04/2019 12:00 9 0.11111111
## 47 0.3333333 19/04/2019 12:00 10 0.10000000
## 48 0.2500000 21/04/2019 12:00 11 0.09090909
## 49 0.3333333 20/04/2019 12:00 10 0.10000000
## 50 0.3333333 21/04/2019 12:00 11 0.09090909
## 51 0.3333333 17/04/2019 12:00 7 0.14285714
## 52 0.3333333 21/04/2019 12:00 11 0.09090909
## 53 0.3333333 21/04/2019 12:00 11 0.09090909
## 54 0.3333333 20/04/2019 12:00 10 0.10000000
## 55 0.3333333 20/04/2019 12:00 10 0.10000000
## 56 0.3333333 21/04/2019 12:00 10 0.10000000
## 57 0.3333333 26/04/2019 12:00 6 0.16666667
## 58 0.3333333 01/05/2019 12:00 9 0.11111111
## 59 0.3333333 04/05/2019 12:00 7 0.14285714
## 60 0.5000000 12/04/2019 12:00 6 0.16666667
## 61 0.5000000 02/05/2019 12:00 4 0.25000000
## 62 0.5000000 02/05/2019 12:00 3 0.33333333
## 63 0.5000000 07/05/2019 12:00 5 0.20000000
## 64 0.3333333 11/05/2019 12:00 7 0.14285714
## 65 0.5000000 06/05/2019 12:00 4 0.25000000
## 66 0.5000000 02/05/2019 12:00 5 0.20000000
## 67 0.5000000 06/05/2019 12:00 4 0.25000000
## 68 0.5000000 28/04/2019 12:00 8 0.12500000
## 69 0.5000000 02/05/2019 12:00 8 0.12500000
## 70 0.5000000 02/05/2019 12:00 6 0.16666667
## 71 0.5000000 02/05/2019 12:00 5 0.20000000
## 72 0.3333333 02/05/2019 12:00 3 0.33333333
## 73 0.5000000 02/05/2019 12:00 4 0.25000000
## 74 0.5000000 01/05/2019 12:00 3 0.33333333
## 75 0.5000000 03/05/2019 12:00 4 0.25000000
## 76 0.5000000 30/04/2019 12:00 4 0.25000000
## 77 0.5000000 26/04/2019 12:00 5 0.20000000
## 78 0.5000000 29/04/2019 12:00 7 0.14285714
## 79 0.5000000 01/05/2019 12:00 5 0.20000000
## 80 0.5000000 05/05/2019 12:00 8 0.12500000
## 81 0.5000000 05/05/2019 12:00 7 0.14285714
## 82 0.5000000 12/05/2019 12:00 9 0.11111111
## 83 0.5000000 12/04/2019 12:00 6 0.16666667
## 84 0.5000000 13/04/2019 12:00 7 0.14285714
## 85 0.3333333 13/04/2019 12:00 6 0.16666667
## 86 0.3333333 19/04/2019 12:00 9 0.11111111
## 87 0.3333333 22/04/2019 12:00 12 0.08333333
## 88 0.5000000 12/04/2019 12:00 6 0.16666667
## 89 0.5000000 12/04/2019 12:00 6 0.16666667
## 90 0.5000000 13/04/2019 12:00 7 0.14285714
## 91 0.5000000 13/04/2019 12:00 7 0.14285714
## 92 0.5000000 12/04/2019 12:00 6 0.16666667
## 93 0.3333333 13/04/2019 12:00 6 0.16666667
## 94 0.5000000 14/04/2019 12:00 7 0.14285714
## 95 0.5000000 12/04/2019 12:00 6 0.16666667
## 96 0.5000000 12/04/2019 12:00 6 0.16666667
## 97 0.5000000 12/04/2019 12:00 6 0.16666667
## 98 0.5000000 12/04/2019 12:00 6 0.16666667
## 99 0.5000000 12/04/2019 12:00 6 0.16666667
## 100 0.5000000 13/04/2019 12:00 6 0.16666667
## 101 0.5000000 16/04/2019 12:00 9 0.11111111
## 102 0.5000000 14/04/2019 12:00 7 0.14285714
## 103 1.0000000 13/04/2019 12:00 7 0.14285714
## 104 0.3333333 17/04/2019 12:00 9 0.11111111
## 105 0.5000000 17/04/2019 12:00 10 0.10000000
## 106 0.5000000 23/04/2019 12:00 2 0.50000000
## 107 0.3333333 20/04/2019 12:00 10 0.10000000
## 108 1.0000000 10/04/2019 12:00 1 1.00000000
## 109 1.0000000 22/04/2019 12:00 2 0.50000000
## 110 0.5000000 27/04/2019 12:00 2 0.50000000
## 111 1.0000000 27/04/2019 12:00 1 1.00000000
## 112 0.5000000 13/04/2019 12:00 3 0.33333333
## 113 0.5000000 17/04/2019 12:00 3 0.33333333
## 114 1.0000000 27/04/2019 12:00 2 0.50000000
## 115 1.0000000 27/04/2019 12:00 2 0.50000000
## 116 0.5000000 26/04/2019 12:00 1 1.00000000
## 117 0.2500000 24/04/2019 12:00 13 0.07692308
## 118 0.5000000 07/04/2019 12:00 3 0.33333333
## 119 0.5000000 07/04/2019 12:00 2 0.50000000
## 120 0.5000000 15/04/2019 12:00 8 0.12500000
## 121 0.5000000 09/04/2019 12:00 4 0.25000000
## 122 0.5000000 09/04/2019 12:00 4 0.25000000
## 123 0.3333333 23/04/2019 12:00 12 0.08333333
## 124 1.0000000 07/04/2019 12:00 3 0.33333333
## 125 0.3333333 15/04/2019 12:00 7 0.14285714
## 126 0.3333333 18/04/2019 12:00 7 0.14285714
## 127 0.5000000 08/04/2019 12:00 3 0.33333333
## 128 0.5000000 06/04/2019 12:00 1 1.00000000
## 129 0.3333333 25/04/2019 12:00 14 0.07142857
## 130 0.5000000 07/04/2019 12:00 2 0.50000000
## 131 1.0000000 07/04/2019 12:00 2 0.50000000
## 132 0.5000000 13/04/2019 12:00 6 0.16666667
## 133 0.5000000 09/04/2019 12:00 4 0.25000000
## 134 1.0000000 06/04/2019 12:00 2 0.50000000
## 135 0.5000000 08/04/2019 12:00 3 0.33333333
## 136 0.5000000 08/04/2019 12:00 3 0.33333333
## 137 0.5000000 09/04/2019 12:00 4 0.25000000
## 138 1.0000000 08/04/2019 12:00 4 0.25000000
## 139 1.0000000 08/04/2019 12:00 4 0.25000000
## 140 0.5000000 08/04/2019 12:00 3 0.33333333
## 141 0.5000000 07/04/2019 12:00 2 0.50000000
## 142 0.5000000 08/04/2019 12:00 3 0.33333333
## 143 0.5000000 09/04/2019 12:00 4 0.25000000
## 144 0.5000000 07/04/2019 12:00 2 0.50000000
## 145 1.0000000 07/04/2019 12:00 1 1.00000000
## 146 0.5000000 07/04/2019 12:00 2 0.50000000
## 147 1.0000000 07/04/2019 12:00 3 0.33333333
## 148 0.5000000 08/04/2019 12:00 3 0.33333333
## 149 0.5000000 06/04/2019 12:00 1 1.00000000
## 150 0.5000000 07/04/2019 12:00 2 0.50000000
## 151 0.5000000 26/04/2019 12:00 7 0.14285714
## 152 0.5000000 30/04/2019 12:00 4 0.25000000
## 153 0.5000000 16/04/2019 12:00 8 0.12500000
## 154 0.2500000 28/04/2019 12:00 14 0.07142857
## 155 0.5000000 02/05/2019 12:00 4 0.25000000
## 156 0.5000000 06/05/2019 12:00 7 0.14285714
## 157 NA <NA> NA NA
## 158 NA <NA> NA NA
## 159 NA <NA> NA NA
## 160 NA <NA> NA NA
## 161 NA <NA> NA NA
## 162 NA <NA> NA NA
## 163 NA <NA> NA NA
## 164 NA <NA> NA NA
## 165 NA <NA> NA NA
## 166 NA <NA> NA NA
## 167 NA <NA> NA NA
## 168 NA <NA> NA NA
## 169 NA <NA> NA NA
## 170 NA <NA> NA NA
## 171 NA <NA> NA NA
## 172 NA <NA> NA NA
## 173 NA <NA> NA NA
## 174 NA <NA> NA NA
## 175 NA <NA> NA NA
## 176 NA <NA> NA NA
## 177 NA <NA> NA NA
## 178 NA <NA> NA NA
## 179 NA <NA> NA NA
## 180 NA <NA> NA NA
## 181 NA <NA> NA NA
## 182 NA <NA> NA NA
## 183 NA <NA> NA NA
## 184 NA <NA> NA NA
## 185 NA <NA> NA NA
## 186 NA <NA> NA NA
## 187 NA <NA> NA NA
## 188 NA <NA> NA NA
## 189 NA <NA> NA NA
## 190 NA <NA> NA NA
## 191 NA <NA> NA NA
## 192 NA <NA> NA NA
## 193 NA <NA> NA NA
## 194 NA <NA> NA NA
## 195 NA <NA> NA NA
## 196 NA <NA> NA NA
## 197 NA <NA> NA NA
## 198 NA <NA> NA NA
## 199 NA <NA> NA NA
## 200 NA <NA> NA NA
## 201 NA <NA> NA NA
## 202 NA <NA> NA NA
## 203 NA <NA> NA NA
## 204 NA <NA> NA NA
## 205 NA <NA> NA NA
## 206 NA <NA> NA NA
## 207 NA <NA> NA NA
## 208 NA <NA> NA NA
## 209 NA <NA> NA NA
## 210 NA <NA> NA NA
## 211 NA <NA> NA NA
## 212 NA <NA> NA NA
## 213 NA <NA> NA NA
## 214 NA <NA> NA NA
## 215 NA <NA> NA NA
## 216 NA <NA> NA NA
## 217 NA <NA> NA NA
## 218 NA <NA> NA NA
## 219 NA <NA> NA NA
## 220 NA <NA> NA NA
## 221 NA <NA> NA NA
## 222 NA <NA> NA NA
## 223 NA <NA> NA NA
## 224 NA <NA> NA NA
## 225 NA <NA> NA NA
## 226 NA <NA> NA NA
## 227 NA <NA> NA NA
## 228 NA <NA> NA NA
## 229 NA <NA> NA NA
## 230 NA <NA> NA NA
## 231 NA <NA> NA NA
## 232 NA <NA> NA NA
## 233 NA <NA> NA NA
## 234 NA <NA> NA NA
## 235 NA <NA> NA NA
## 236 NA <NA> NA NA
## 237 NA <NA> NA NA
## 238 NA <NA> NA NA
## 239 NA <NA> NA NA
## 240 NA <NA> NA NA
## 241 NA <NA> NA NA
## 242 NA <NA> NA NA
## 243 NA <NA> NA NA
## 244 NA <NA> NA NA
## 245 NA <NA> NA NA
## 246 NA <NA> NA NA
## 247 NA <NA> NA NA
## 248 NA <NA> NA NA
## 249 NA <NA> NA NA
## 250 NA <NA> NA NA
## 251 NA <NA> NA NA
## 252 NA <NA> NA NA
## 253 NA <NA> NA NA
## 254 NA <NA> NA NA
## 255 NA <NA> NA NA
## 256 NA <NA> NA NA
## 257 NA <NA> NA NA
## 258 NA <NA> NA NA
## 259 NA <NA> NA NA
## 260 NA <NA> NA NA
## 261 NA <NA> NA NA
## 262 NA <NA> NA NA
## 263 NA <NA> NA NA
## 264 NA <NA> NA NA
## 265 NA <NA> NA NA
## 266 NA <NA> NA NA
## 267 NA <NA> NA NA
## 268 NA <NA> NA NA
## 269 NA <NA> NA NA
## 270 NA <NA> NA NA
## hatch_to_a_devtime hatch_to_adult_devrate sex im_surv j_lifetime
## 1 31 0.03225807 female 0 31
## 2 32 0.03125000 female 0 32
## 3 33 0.03030303 female 0 33
## 4 37 0.02702703 female 0 37
## 5 33 0.03030303 female 0 33
## 6 37 0.02702703 female 0 37
## 7 38 0.02631579 female 0 38
## 8 40 0.02500000 female 0 40
## 9 35 0.02857143 female 0 35
## 10 39 0.02564103 female 0 39
## 11 32 0.03125000 female 0 32
## 12 33 0.03030303 female 0 33
## 13 37 0.02702703 female 0 37
## 14 39 0.02564103 female 0 39
## 15 39 0.02564103 female 0 39
## 16 35 0.02857143 female 0 35
## 17 37 0.02702703 female 0 37
## 18 38 0.02631579 female 0 38
## 19 38 0.02631579 female 0 38
## 20 39 0.02564103 female 0 39
## 21 38 0.02631579 female 0 38
## 22 38 0.02631579 female 0 38
## 23 41 0.02439024 female 0 41
## 24 8 0.12500000 female 0 8
## 25 11 0.09090909 female 0 11
## 26 11 0.09090909 female 0 11
## 27 8 0.12500000 female 0 8
## 28 8 0.12500000 female 0 8
## 29 9 0.11111111 female 0 9
## 30 12 0.08333333 female 0 12
## 31 12 0.08333333 female 0 12
## 32 11 0.09090909 female 0 11
## 33 11 0.09090909 female 0 11
## 34 11 0.09090909 female 0 11
## 35 11 0.09090909 female 0 11
## 36 11 0.09090909 female 0 11
## 37 12 0.08333333 female 0 12
## 38 12 0.08333333 female 0 12
## 39 12 0.08333333 female 0 12
## 40 12 0.08333333 female 0 12
## 41 12 0.08333333 female 0 12
## 42 13 0.07692308 female 0 13
## 43 12 0.08333333 female 0 12
## 44 14 0.07142857 female 0 14
## 45 16 0.06250000 female 0 16
## 46 11 0.09090909 female 0 11
## 47 11 0.09090909 female 0 11
## 48 12 0.08333333 female 0 12
## 49 12 0.08333333 female 0 12
## 50 12 0.08333333 female 0 12
## 51 12 0.08333333 female 0 12
## 52 12 0.08333333 female 0 12
## 53 12 0.08333333 female 0 12
## ... (output truncated: the full data frame has 270 rows; the remaining rows, and the printed blocks for the total_lifespan and length_mm columns, are omitted here for brevity)
For practice, subset the dataset, define any categorical variables, and add any columns that might be missing. Revisit the previous sections if you have forgotten what is needed!
Let’s refit the model from the earlier multiple explanatory variables section, but now including the interaction between temperature and food level. We’ll immediately check that the model is appropriate:
model <- lm(loglength ~ temp * food_level, data= wings)
par(mfrow=c(2,2), mar=c(3,3,1,1), mgp=c(2, 0.8,0))
plot(model)
Now, examine the anova and summary outputs for the model:
anova(model)
## Analysis of Variance Table
##
## Response: loglength
## Df Sum Sq Mean Sq F value Pr(>F)
## temp 1 0.56321 0.56321 186.8376 <2e-16 ***
## food_level 1 0.99144 0.99144 328.8969 <2e-16 ***
## temp:food_level 1 0.00588 0.00588 1.9522 0.1645
## Residuals 145 0.43709 0.00301
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Compared to the model from the first multiple explanatory variables section, there is an extra line at the bottom. The top two are the same and show that temperature and food level both have independent main effects. The extra line shows that there is also an interaction between the two. It doesn’t explain very much variation, and it is non-significant.
Again, we can calculate the \(r^2\) for the model:
\(\frac{0.56 + 0.99 + 0.01}{0.56 + 0.99 + 0.01 + 0.44} = 0.78\)
The model from the first multiple explanatory variables section without the interaction also had an \(r^2 = 0.78\), so our new model explains essentially no additional variation in the data (the interaction accounts for well under 1% of the total sum of squares).
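As a check, this calculation can be reproduced in R directly from the anova table. The sketch below uses simulated data (the variable names are illustrative, not from the wings dataset):

```r
# Sketch: computing r^2 from an anova table, on simulated data
set.seed(1)
x <- rep(1:10, times = 3)
g <- factor(rep(c("low", "mid", "high"), each = 10))
y <- 0.5 * x + as.numeric(g) + rnorm(30, sd = 0.5)
m <- lm(y ~ x * g)

ss <- anova(m)[["Sum Sq"]]          # sums of squares; the residual SS is last
r2 <- sum(head(ss, -1)) / sum(ss)   # explained SS over total SS

# this matches the "Multiple R-squared" reported by summary(m)
all.equal(r2, summary(m)$r.squared)
```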
The summary table is as follows:
summary(model)
##
## Call:
## lm(formula = loglength ~ temp * food_level, data = wings)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.195160 -0.025331 0.003657 0.031510 0.130390
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.275682 0.058950 21.640 < 2e-16 ***
## temp -0.015031 0.002293 -6.556 9.08e-10 ***
## food_level 0.284828 0.072615 3.922 0.000135 ***
## temp:food_level -0.003913 0.002801 -1.397 0.164487
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0549 on 145 degrees of freedom
## Multiple R-squared: 0.7812, Adjusted R-squared: 0.7767
## F-statistic: 172.6 on 3 and 145 DF, p-value: < 2.2e-16
The lines in this output are:
The first lines, as in the model from the ANOVA section, would allow us to find the predicted values for each group if the size of the differences did not vary between levels because of the interaction. That is, this part of the model includes only a single difference between low food and high food, which has to be the same at each temperature because it ignores any interaction between temperature and the low-/high-food identity of each treatment. The remaining lines give the estimated coefficients associated with the interaction terms, and allow the size of the differences to vary between levels because of the further effects of the interaction.
The table below shows how these combine to give the predictions for each group combination, with the two new interaction coefficients shown in red:
| | low food | high food |
|---|---|---|
| 22oC | 0.97 = 0.97 | 0.97 + 0.17 = 1.14 |
| 26oC | 0.97 - 0.08 = 0.89 | 0.97 - 0.08 + 0.17 + 0.01 = 1.07 |
| 32oC | 0.97 - 0.15 = 0.82 | 0.97 - 0.15 + 0.17 - 0.04 = 0.95 |
So why are there two new coefficients? For interactions between two factors, there are always \((n-1)\times(m-1)\) new coefficients, where \(n\) and \(m\) are the number of levels in the two factors (low- or high-food: 2 levels and temperature: 3 levels, in our current example). So in this model, \((3-1) \times (2-1) =2\). It might be easier to understand why graphically:
image
The predictions for the white boxes above can be found by adding the main effects together, but for the grey boxes we need to estimate specific differences. So there are \((n-1)\times(m-1)\) interaction coefficients (count the number of grey boxes) to add.
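A quick way to convince yourself of the \((n-1)\times(m-1)\) rule is to count the interaction columns in a design matrix. This sketch uses two toy factors mirroring our example:

```r
# Sketch: counting interaction coefficients for two crossed factors
temp <- factor(rep(c(22, 26, 32), each = 2))     # n = 3 levels
food <- factor(rep(c("low", "high"), times = 3)) # m = 2 levels

# columns of the design matrix for the full interaction model
cols <- colnames(model.matrix(~ temp * food))
# interaction columns are the ones containing ":"
n_interaction <- sum(grepl(":", cols))
n_interaction   # (3 - 1) * (2 - 1) = 2
```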
If we put this together, what is the model telling us?
This finding suggests that food limitation can exacerbate the negative effect that increased temperature has on body size. However, the interaction term in the anova output suggests that the variation between food levels in the degree to which body size decreases with temperature is non-significant.
\(\star\) Copy the code above into your script and run the model.
Make sure you understand the output!
Just to make sure the sums above are correct, we’ll use the same approach as in the earlier multiple explanatory variables section to get R to calculate the predictions for us:
Again, remember that the each = 2 option repeats each value twice in succession, while the times = 3 option repeats the whole set of values (the whole vector) three times.
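If you don't have that earlier code to hand, a sketch of it looks like this (the names temp and food_level and their levels are taken from the wings example; predict would then be called on this data frame with the fitted model):

```r
# Sketch: building a data frame of all group combinations for prediction
predVals <- data.frame(
    temp       = factor(rep(c(22, 26, 32), each = 2)),  # each = 2: each temp twice
    food_level = factor(rep(c(0.1, 1), times = 3))      # times = 3: the pair repeated
)
# with the fitted model available, the predictions would be:
# predict(model, newdata = predVals)
predVals
```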
\(\star\) Include and run the code for generating these predictions in your script.
If we plot these data points onto the barplot from the first multiple explanatory variables section, they now lie exactly on the mean values, because we’ve allowed for interactions. How much overlap would you expect, if you were to plot the predictions from the model without the interaction term?
We’ll go all the way back to the regression analyses from the Regression section. Remember that we fitted two separate regression lines to the data for damselflies and dragonflies. We’ll now use an interaction to fit these in a single model. This kind of linear model — with a mixture of continuous variables and factors — is often called an analysis of covariance, or ANCOVA. That is, ANCOVA is a type of linear model that blends ANOVA and regression. ANCOVA evaluates whether population means of a dependent variable are equal across levels of a categorical independent variable, while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates.
Thus, ANCOVA is a linear model with one categorical and one or more continuous predictors.
We will use the odonates data.
\(\star\) First load the data:
odonata <- read.csv('../data/GenomeSize.csv', stringsAsFactors = TRUE)
\(\star\) Now create two new variables in the odonata data set called logGS and logBW containing log genome size and log body weight:
odonata$logGS <- log(odonata$GenomeSize)
odonata$logBW <- log(odonata$BodyWeight)
The models we fitted before looked like this:
image
We can now fit the model of body weight as a function of both genome size and suborder:
odonModel <- lm(logBW ~ logGS * Suborder, data = odonata)
summary(odonModel)
##
## Call:
## lm(formula = logBW ~ logGS * Suborder, data = odonata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3243 -0.3225 0.0073 0.3962 1.4976
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.39947 0.08475 -28.311 < 2e-16 ***
## logGS 1.00522 0.22367 4.494 1.99e-05 ***
## SuborderZygoptera -2.24895 0.13540 -16.610 < 2e-16 ***
## logGS:SuborderZygoptera -2.14919 0.46186 -4.653 1.07e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6499 on 94 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.7549, Adjusted R-squared: 0.7471
## F-statistic: 96.5 on 3 and 94 DF, p-value: < 2.2e-16
Again, we’ll look at the anova table first:
anova(odonModel)
## Analysis of Variance Table
##
## Response: logBW
## Df Sum Sq Mean Sq F value Pr(>F)
## logGS 1 1.144 1.144 2.710 0.1031
## Suborder 1 111.968 111.968 265.133 < 2.2e-16 ***
## logGS:Suborder 1 9.145 9.145 21.654 1.068e-05 ***
## Residuals 94 39.697 0.422
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Interpreting this:
There is no significant main effect of log genome size. The interaction is the important thing here: genome size is hugely important, but it does very different things in the two suborders. If we ignore Suborder, there isn’t an overall relationship; the average of the two suborder-specific lines is pretty much flat.
There is a very strong main effect of Suborder: the mean body weights of the two groups are very different.
There is a strong interaction between suborder and genome size. This is an interaction between a factor and a continuous variable and shows that the slopes are different for the different factor levels.
Now for the summary table:
summary(odonModel)
##
## Call:
## lm(formula = logBW ~ logGS * Suborder, data = odonata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3243 -0.3225 0.0073 0.3962 1.4976
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.39947 0.08475 -28.311 < 2e-16 ***
## logGS 1.00522 0.22367 4.494 1.99e-05 ***
## SuborderZygoptera -2.24895 0.13540 -16.610 < 2e-16 ***
## logGS:SuborderZygoptera -2.14919 0.46186 -4.653 1.07e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6499 on 94 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.7549, Adjusted R-squared: 0.7471
## F-statistic: 96.5 on 3 and 94 DF, p-value: < 2.2e-16
The first thing to note is that the \(r^2\) value is really high. The model explains about three quarters (0.755) of the variation in the data.
Next, there are four coefficients:
* The intercept, which is for the first level of Suborder: Anisoptera (dragonflies).
* The slope for log genome size, which is the slope for Anisoptera.
* The coefficient for the second level of Suborder, Zygoptera (damselflies). As with the earlier models, this difference between factor levels is a difference in mean values; here it is the difference in the intercept for Zygoptera.
* The interaction between Suborder and logGS, which shows how the slope for Zygoptera differs from the slope for Anisoptera.

How do these hang together to give the two lines shown in the model? We can calculate them by hand:
\[\begin{align*} \textrm{Body Weight} &= -2.40 + 1.01 \times \textrm{logGS} & \textrm{[Anisoptera]}\\ \textrm{Body Weight} &= (-2.40 -2.25) + (1.01 - 2.15) \times \textrm{logGS} & \textrm{[Zygoptera]}\\ &= -4.65 - 1.14 \times \textrm{logGS} \end{align*}\]
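We can get R to do the same arithmetic from the fitted coefficients. The sketch below uses simulated data in place of the odonates, but the pattern (add the factor-level offset to the intercept, and the interaction coefficient to the slope) is exactly the same:

```r
# Sketch: recovering group-specific lines from a factor-by-continuous interaction
set.seed(42)
x   <- runif(60, -1, 1)
grp <- factor(rep(c("A", "Z"), each = 30))
y   <- ifelse(grp == "A", -2.4 + 1.0 * x, -4.65 - 1.14 * x) + rnorm(60, sd = 0.1)
m   <- lm(y ~ x * grp)

cf <- coef(m)   # names: (Intercept), x, grpZ, x:grpZ
interceptA <- cf["(Intercept)"]
slopeA     <- cf["x"]
interceptZ <- cf["(Intercept)"] + cf["grpZ"]   # add the factor-level offset
slopeZ     <- cf["x"] + cf["x:grpZ"]           # add the interaction coefficient
c(interceptZ, slopeZ)
```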
\(\star\) Add the above code into your script and check that you understand the outputs.
We’ll use the predict function again to get the predicted values from the model and add lines to the plot above.
First, we’ll create a set of numbers spanning the range of genome size:
#get the range of the data:
rng <- range(odonata$logGS)
#get a sequence from the min to the max with 100 equally spaced values:
LogGSForFitting <- seq(rng[1], rng[2], length = 100)
Have a look at these numbers:
print(LogGSForFitting)
## [1] -0.891598119 -0.873918728 -0.856239337 -0.838559945 -0.820880554
## [6] -0.803201163 -0.785521772 -0.767842380 -0.750162989 -0.732483598
## [11] -0.714804206 -0.697124815 -0.679445424 -0.661766032 -0.644086641
## [16] -0.626407250 -0.608727859 -0.591048467 -0.573369076 -0.555689685
## [21] -0.538010293 -0.520330902 -0.502651511 -0.484972119 -0.467292728
## [26] -0.449613337 -0.431933946 -0.414254554 -0.396575163 -0.378895772
## [31] -0.361216380 -0.343536989 -0.325857598 -0.308178207 -0.290498815
## [36] -0.272819424 -0.255140033 -0.237460641 -0.219781250 -0.202101859
## [41] -0.184422467 -0.166743076 -0.149063685 -0.131384294 -0.113704902
## [46] -0.096025511 -0.078346120 -0.060666728 -0.042987337 -0.025307946
## [51] -0.007628554 0.010050837 0.027730228 0.045409619 0.063089011
## [56] 0.080768402 0.098447793 0.116127185 0.133806576 0.151485967
## [61] 0.169165358 0.186844750 0.204524141 0.222203532 0.239882924
## [66] 0.257562315 0.275241706 0.292921098 0.310600489 0.328279880
## [71] 0.345959271 0.363638663 0.381318054 0.398997445 0.416676837
## [76] 0.434356228 0.452035619 0.469715011 0.487394402 0.505073793
## [81] 0.522753184 0.540432576 0.558111967 0.575791358 0.593470750
## [86] 0.611150141 0.628829532 0.646508923 0.664188315 0.681867706
## [91] 0.699547097 0.717226489 0.734905880 0.752585271 0.770264663
## [96] 0.787944054 0.805623445 0.823302836 0.840982228 0.858661619
We can now use the model to predict the values of body weight at each of those points for each of the two suborders:
#get a data frame of new data for the order
ZygoVals <- data.frame(logGS = LogGSForFitting, Suborder = "Zygoptera")
#get the predictions and standard error
ZygoPred <- predict(odonModel, newdata = ZygoVals, se.fit = TRUE)
#repeat for anisoptera
AnisoVals <- data.frame(logGS = LogGSForFitting, Suborder = "Anisoptera")
AnisoPred <- predict(odonModel, newdata = AnisoVals, se.fit = TRUE)
We’ve added se.fit=TRUE to the function to get the standard error around the regression lines. Both AnisoPred and ZygoPred contain predicted values (called fit) and standard errors (called se.fit) for each of the values in LogGSForFitting, for each of the two suborders.
We can add the predictions onto a plot like this:
# plot the scatterplot of the data
plot(logBW ~ logGS, data = odonata, col = Suborder)
# add the predicted lines
lines(AnisoPred$fit ~ LogGSForFitting, col = "black")
lines(AnisoPred$fit + AnisoPred$se.fit ~ LogGSForFitting, col = "black", lty = 2)
lines(AnisoPred$fit - AnisoPred$se.fit ~ LogGSForFitting, col = "black", lty = 2)
\(\star\) Copy the prediction code into your script and run the plot above.
Copy and modify the last three lines to add the lines for the Zygoptera. Your final plot should look like this:
image
In biology, we often use statistics to compare competing hypotheses in order to work out the simplest explanation for some data. This often involves collecting several explanatory variables that describe different hypotheses and then fitting them together in a single model, and often including interactions between those variables.
In all likelihood, not all of these model terms will be important. If we remove unimportant terms, the explanatory power of the model will get worse, but it might not get significantly worse.
“It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.”
– Albert Einstein
Or to paraphrase:
“Everything should be made as simple as possible, but no simpler.”
The approach we will look at is to start with a maximal model — the model that contains everything that might be important — and simplify it towards the null model — the model that says that none of your variables are important. Hopefully, there is a point somewhere in between where you can’t remove any further terms without making the model significantly worse: this is called the minimum adequate model.
image
The main aim of this section\(^{[1]}\) is to learn how to build and then simplify complex models by removing non-explanatory terms, to arrive at the Minimum Adequate Model.
Model simplification is an iterative process. The flow diagram below shows how it works: at each stage you try and find an acceptable simplification. If successful, then you start again with the new simpler model and try and find a way to simplify this, until eventually, you can’t find anything more to remove.
image
As always, we can use an \(F\)-test to compare two models and see if they have significantly different explanatory power (there are also other ways to do this, such as using the Akaike Information Criterion, but we will not cover this here). In this context, the main thing to remember is that significance of the \(F\)-test used to compare a model and its simplified counterpart is a bad thing — it means that we’ve removed a term from the fitted model that makes it significantly worse.
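As a reminder of what such a comparison looks like in practice, here is a minimal sketch on simulated data, where the dropped term has no real effect (so we would not expect the comparison to be significant):

```r
# Sketch: F-test comparison of nested models via anova()
set.seed(7)
d <- data.frame(a = rnorm(100), b = rnorm(100))
d$y <- 2 * d$a + rnorm(100)          # b has no real effect on y

full    <- lm(y ~ a + b, data = d)
reduced <- update(full, . ~ . - b)   # drop b from the model

cmp <- anova(reduced, full)
cmp$`Pr(>F)`[2]   # p-value for the comparison of the two models
```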
We’ll be using the wing length dataset for this practical, so once again:
\(\star\) Make sure you have changed the working directory to your stats module code folder.
\(\star\) Create a new blank script called MyModelSimp.R.
\(\star\) Load the wing length data into a data frame called wings:
wings <- read.csv('../data/traitdata_Huxleyetal_2021.csv',stringsAsFactors = TRUE)
In previous sections, we looked at how the categorical variables temp and food_level predicted wing length in Aedes aegypti. In this section, we will add another continuous variable: adult lifespan. The first thing we will do is log both variables and reduce the dataset to the rows for which all of these data are available:
wings$temp <- as.factor(wings$temp)
wings$food_level <- as.factor(wings$food_level)
wings$loglength <- log(wings$length_mm)
wings$loglifespan <- log(wings$adult_lifespan)
# reduce dataset to four key variables
wings <- subset(wings, select = c(loglength, loglifespan, temp, food_level))
# remove rows with missing data
wings <- na.omit(wings)
\(\star\) Copy the code above into your script and run it
Check that the data you end up with has this structure:
str(wings)
## 'data.frame': 149 obs. of 4 variables:
## $ loglength : num 1.075 0.952 0.88 0.963 0.824 ...
## $ loglifespan: num 2.08 1.61 1.61 1.79 1.39 ...
## $ temp : Factor w/ 3 levels "22","26","32": 1 1 1 1 1 1 1 1 1 1 ...
## $ food_level : Factor w/ 2 levels "0.1","1": 1 1 1 1 1 1 1 1 1 1 ...
## - attr(*, "na.action")= 'omit' Named int 38 42 60 103 136 142 143 157 158 159 ...
## ..- attr(*, "names")= chr "38" "42" "60" "103" ...
First let’s fit a model including all of these variables and all of the interactions:
model <- lm(formula = loglifespan ~ loglength * temp * food_level, data = wings)
\(\star\) Add this model-fitting step in your script.
\(\star\) Look at the output of anova(model) and summary(model).
Scared? Don’t be! There are a number of points to this exercise:
These tables show exactly the kind of output you’ve seen before. Sure, there are lots of rows but each row is just asking whether a model term (anova) or a model coefficient (summary) is significant.
Some of the rows are significant, others aren’t: some of the model terms are not explanatory.
The two tables show similar things: there are only a few stars in both the anova table and the summary table.
Look at the last line in the anova table: loglength:temp:food_level. This is a three-way interaction, capturing how the relationship between lifespan and wing length changes across the different combinations of temperature and food level. Does this seem easy to understand?
The real lesson here is that it is easy to fit complicated models in R.
Understanding and explaining them is a different matter.
The temptation is always to start with the most complex possible model but this is rarely a good idea.
Instead of all possible interactions, we’ll consider two-way interactions: how do pairs of variables affect each other?
There is a shortcut for this: y ~ (a + b + c)^2 gets all two way combinations of the variables in the brackets, so is a quicker way of getting this model:
y ~ a + b + c + a:b + a:c + b:c
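You can verify the expansion without fitting anything, by inspecting the terms of the formula:

```r
# Sketch: the ^2 shortcut expands to all main effects plus two-way interactions
f <- y ~ (a + b + c)^2
attr(terms(f), "term.labels")

# compare with writing the terms out explicitly:
g <- y ~ a + b + c + a:b + a:c + b:c
identical(sort(attr(terms(f), "term.labels")),
          sort(attr(terms(g), "term.labels")))
```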
So let’s use this to fit a simpler maximal model:
model <- lm(loglifespan ~ (loglength + temp + food_level)^2, data = wings)
The anova table for this model looks like this:
anova(model)
## Analysis of Variance Table
##
## Response: loglifespan
## Df Sum Sq Mean Sq F value Pr(>F)
## loglength 1 29.9465 29.9465 299.2418 < 2.2e-16 ***
## temp 2 17.5835 8.7918 87.8522 < 2.2e-16 ***
## food_level 1 0.0488 0.0488 0.4875 0.486215
## loglength:temp 2 1.3925 0.6963 6.9573 0.001319 **
## loglength:food_level 1 0.1193 0.1193 1.1921 0.276787
## temp:food_level 2 0.5558 0.2779 2.7768 0.065693 .
## Residuals 139 13.9104 0.1001
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The first lines are the main effects: two are significant and one isn’t. Then there are the three two-way interactions. One of these is highly significant: loglength:temp, which suggests that the slope of log lifespan against wing length differs between temperatures. The other two interactions are non-significant, although temp:food_level is close.
\(\star\) Run this model in your script.
\(\star\) Look at the output of anova(model) and summary(model).
\(\star\) Generate and inspect the model diagnostic plots.
Now let’s simplify the model we fitted above. Model simplification is not as straightforward as just dropping terms. Each time you remove a term, the model changes: it will get worse, since some of the sums of squares are no longer explained, but the remaining variables may partly compensate for this loss of explanatory power. The main point is that if the model gets only a little worse, that’s OK: the tiny amount of additional variation explained by the term you removed was not really worth keeping.
But how much is a “tiny amount”? This is what we will now learn by using the \(F\)-test. Again, remember: a significant \(F\)-test comparing a model with its simplified counterpart is a bad thing, because it means that removing the term made the model significantly worse.
The first question is: which terms can you remove from a model? Obviously, you only want to remove non-significant terms, but there is another rule: you cannot remove a main effect or an interaction while it is still present in a more complex interaction. For example, in the model y ~ a + b + c + a:b + a:c + b:c, you cannot drop c without also dropping both a:c and b:c.
The R function drop.scope tells you what you can drop from a model. Some examples:
drop.scope(model)
## [1] "loglength:temp" "loglength:food_level" "temp:food_level"
drop.scope(y ~ a + b + c + a:b)
## [1] "c" "a:b"
drop.scope(y ~ a + b + c + a:b + b:c + a:b:c)
## [1] "a:b:c"
The last thing we need to do is work out how to remove a term from a model. We could type out the model again, but there is a shortcut using the function update:
# a simple model
f <- y ~ a + b + c + b:c
# remove b:c from the current model
update(f, . ~ . - b:c)
## y ~ a + b + c
# model g as a response using the same explanatory variables.
update(f, g ~ .)
## g ~ a + b + c + b:c
Yes, the syntax is weird. The function uses a model or a formula and then allows you to alter the current formula. The dots in the code . ~ . mean ‘use whatever is currently in the response or explanatory variables’. It gives a simple way of changing a model.
Now that you have learned the syntax, let’s try model simplification with the wings dataset.
From the above anova and drop.scope output, we know that the interaction loglength:temp:food_level is not significant and is a valid term to remove. So, let’s remove this term:
model2 <- update(model, . ~ . - loglength:temp:food_level)
And now use ANOVA to compare the two models:
anova(model, model2)
## Analysis of Variance Table
##
## Model 1: loglifespan ~ (loglength + temp + food_level)^2
## Model 2: loglifespan ~ loglength + temp + food_level + loglength:temp +
## loglength:food_level + temp:food_level
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 139 13.91
## 2 139 13.91 0 0
This tells us that model2 is not significantly worse than model. That is, dropping that one interaction term did not result in much of a loss of predictability.
Now let’s look at this simplified model and see what else can be removed:
anova(model2)
## Analysis of Variance Table
##
## Response: loglifespan
## Df Sum Sq Mean Sq F value Pr(>F)
## loglength 1 29.9465 29.9465 299.2418 < 2.2e-16 ***
## temp 2 17.5835 8.7918 87.8522 < 2.2e-16 ***
## food_level 1 0.0488 0.0488 0.4875 0.486215
## loglength:temp 2 1.3925 0.6963 6.9573 0.001319 **
## loglength:food_level 1 0.1193 0.1193 1.1921 0.276787
## temp:food_level 2 0.5558 0.2779 2.7768 0.065693 .
## Residuals 139 13.9104 0.1001
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop.scope(model2)
## [1] "loglength:temp" "loglength:food_level" "temp:food_level"
\(\star\) Add this first simplification to your script and re-run it.
\(\star\) Look at the output above and decide what is the next possible term to delete
\(\star\) Using the code above as a model, create model3 as the next simplification. (remember to use model2 in your update call and not model).
Now for a more difficult exercise:
\(\star\) Using the code above to guide you, try and find a minimal adequate model that you are happy with. In each step, the output of anova(model, modelN) should be non-significant (where \(N\) is the current step).
It can be important to consider both the anova and summary tables. It can be worth trying to remove terms that look significant in one table but not the other: some terms can explain significant variation in the anova table even though their coefficients in the summary table are not significant.
Remember to remove whole terms: with categorical variables, several coefficients in the summary table may come from a single term in the model and have to be removed together.
When you have got your final model, save the model as an R data file: save(modelN, file='myFinalModel.Rda').
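A minimal sketch of saving and reloading a fitted model (using a toy model fitted to R’s built-in cars dataset, and a temporary file in place of your final modelN):

```r
# Sketch: saving a fitted model to disk and loading it back
m <- lm(dist ~ speed, data = cars)    # stand-in for your final model
path <- tempfile(fileext = ".Rda")    # stand-in for 'myFinalModel.Rda'
save(m, file = path)

rm(m)        # remove it from the workspace...
load(path)   # ...and restore it under its original name
coef(m)
```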